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Foreword 



On behalf of the organizing committee I would like to welcome you all to the 
second Asian Symposium on Programming Languages and Systems (APLAS 
2004) held in Taipei on November 4-6, 2004. Since the year 2000, researchers 
in the area of programming languages and systems have been meeting annually 
in Asia to present their most recent research results, thus contributing to the 
advancement of this research area. The last four meetings were held in Singapore 
(2000), Daejeon (2001), Shanghai (2002), and Beijing (2003). These meetings 
were very fruitful and provided an excellent venue for the exchange of research 
ideas, findings and experiences in programming languages and systems. APLAS 
2004 is the fifth such meeting and the second one in symposium setting. The 
first symposium was held in Beijing last year. 

The success of the APLAS series is the collective result of many people’s 
contributions. For APLAS 2004, first I would like to thank all the members of the 
Program Committee, in particular the Program Chair Wei-Ngan Chin, for their 
hard work in putting together an excellent program. I am most grateful to invited 
speakers, Joxan Jaffar, Frank Pfenning, and Martin Odersky, who have traveled 
a long way to deliver their speeches at APLAS 2004. I would like to thank all 
the referees, who helped review the manuscripts, the authors, who contributed 
to the proceedings of APLAS 2004, the members of the Organizing Committee, 
who made considerable effort to organize this event, and all the participants 
present at this meeting. Without your support this symposium would not have 
been possible. Finally I would like to acknowledge the support of the Asian 
Association for Foundation of Software and Academia Sinica, Taiwan. 

I am sure you will enjoy this meeting, and I hope you will also find time to 
do some sightseeing in Taipei and take back some fond memories of your visit 
after this meeting is over. 



September 2004 



D. T. Lee 




Preface 



This volume contains the proceedings of the 2nd Asian Symposium on Pro- 
gramming Languages and Systems (APLAS 2004). The symposium was held in 
Taipei, Taiwan and was sponsored by the Asian Association for Foundation of 
Software (AAFS) and the Academia Sinica. 

Following our call for papers, 97 full submissions were received. Almost all 
the papers were reviewed by three (or more) program committee members with 
the help of external reviewers. The program committee met electronically over 
a 10-day period and accepted 26 papers after careful deliberations. I would like 
to thank members of the APLAS 2004 Program Committee, for the tremendous 
effort they put into their reviews and deliberations, and all the external reviewers 
for their invaluable contributions. 

The final program covered both foundational and practical issues in program- 
ming languages and systems. Apart from the 26 accepted papers, the sympo- 
sium also featured invited talks from three distinguished speakers, Joxan Jaffar 
(National University of Singapore), Frank Pfenning (Carnegie Mellon Univer- 
sity, USA) and Martin Odersky (Ecole Polytechnique Federale cle Lausanne, 
Switzerland). 

Many people helped to promote APLAS as a high-quality forum in Asia to 
serve programming language researchers worldwide. Following a series of well- 
attended workshops that were held in Singapore (2000), Daejeon (2001), and 
Shanghai (2002), the first formal symposium was successfully held in Beijing in 
2003. The present symposium benefited from the past momentum, and was also 
due to the contributions of many people. 

Foremost, I am grateful to the General Chair, D. T. Lee, for his invalu- 
able support and guidance, making our symposium in Taipei possible. I am also 
indebted to our Local Arrangements Chair, Tyng-Ruey Clruang, for the consid- 
erable effort he put into planning and organizing the meeting itself. Hidelriko 
Masulrara kindly agreed to act as Poster Chair, and Slrengclrao Qin helped with 
publicity matters. From the AAFS Committee, I would like to especially thank 
Atsushi Ohori and Tetsuo Ida for providing sound advice. Last but not least, I 
thank Florin Craciun for his dedication in handling the CyberChair submissions 
system and other administrative matters. 
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Wei-Ngan Chin 




Organization 



General Chair 

D.T. Lee (Academia Sinica, Taiwan) 

Program Chair 

Wei-Ngan Chin (National University of Singapore) 

Program Committee 

Jifeng He (United Nations University, Macau) 

Thomas Henzinger (University of California, Berkeley, USA) 
Yuh-Jzer Joung (National Taiwan University, Taiwan) 
Gabriele Keller (University of New South Wales, Australia) 
Jenq-Kuen Lee (National Tsinghua University, Taiwan) 

Luc Maranget (INRIA, France) 

Hidehiko Masulrara (University of Tokyo, Japan) 

Luke Ong (University of Oxford, UK) 

Tamiya Onodera (IBM Research, Japan) 

Zongyan Qiu (Peking University, China) 

Martin Rinard (Massachusetts Institute of Technology, USA) 
David Sands (Chalmers University of Technology, Sweden) 
Akilriko Takano (National Institute of Informatics, Japan) 
Kazunori Ueda (Waseda University, Japan) 

Clrengyong Wu (Chinese Academy of Science, China) 
Hongwei Xi (Boston University, USA) 

Kwangkeun Yi (Seoul National University, Korea) 



Local Arrangements Chair 

Tyng-R.uey Chuang (Academia Sinica, Taiwan) 



Poster Chair 

Hidehiko Masulrara (University of Tokyo, Japan) 

Publicity 

Slrengclrao Qin (National University of Singapore) 




X 



Organization 



External Referees 

Seika Abe 
Joonseon Ahn 
Ki-yung Ahn 
Wolfgang Ahrendt 

C. Scott Ananian 
Stefan Andrei 
Thomas Arts 
Martin Berger 
Manuel Clrakravarty 
Byeong-Mo Chang 
Rong-Guey Chang 
Chiyan Chen 
Chung-Kai Chen 
Gang Chen 
Woongsik Choi 
Tyng-Ruey Chuang 
Koen Claessen 
Florin Craciun 
Huimin Cui 

Xie Haibin 
Fritz Henglein 
Kolrei Honda 
Gwan-Hwang Hwang 
Chung- Wen Hwang 

D. Doligez 
Derek Dreyer 
Kai Engellrardt 
Hyunjun Eo 
Jacques Garrigue 
Martin Giese 



William Greenland 
Hai-Feng Guo 
Hwansoo Han 
Pao-Ann Hsiung 
Ming-Yu Hung 
Tatsushi Inagaki 
Wu Jiajun 
Jang-Wu Jo 
Hyun-Goo Kang 
Paul Kennedy 
Siau-Cheng Khoo 
Youil Kim 
Jaejin Lee 
Oukseh Lee 
James Leifer 
Young-Jia Lin 
Tao Liu 
Zlranglin Liu 
Tom Melham 
Francois Metayer 
Oege de Moor 
Andrzej Murawski 
Keisuke Nakano 
Huu Hai Nguyen 
Susumu Nishimura 
Jeff Polakow 
Corneliu Popeea 
Shengchao Qin 
Julian Ratlike 
Masaliiko Sakai 



Masataka Sassa 
Sean Seefried 
Sunae Seo 
Rui Shi 
K. Y. Shieli 
Yuan-Shin 

Donald Bruce Stewart 
Eijiro Sumii 
Josef Svenningsson 
Munehiro Takimoto 
Feng Tang 
C. L. Tang 
Akiliiko Tozawa 
Yih-Kuen Tsay 
Jerome Vouillon 
Ken Wakita 
Bow-Yaw Wang 
Jason Wu 
Dana N. Xu 
Hongseok Yang 
Wuu Yang 
Masahiro Yasugi 
Handong Ye 
Yi-Ping You 
Shoji Yuen 
Patrick Zadarnowski 
Naijun Zhan 
Xiaogang Zhang 
Dengping Zliu 



Sponsoring Institutions 

Asian Association for Foundation of Software (AAFS) 
Academia Sinica, Taiwan 




Table of Contents 



Invited Talk 

A CLP Approach to Modelling Systems 

Joxan Jaffar 1 

Session 1 

An Algebraic Approach to Bi-directional Updating 

Shin- Cheng Mu, Zhenjiang Hu, Masato Takeichi 2 

Network Fusion 

Pascal Fradet, Stephane Hong Tuan Ha 21 

Session 2 

Translation of Tree-Processing Programs into Stream-Processing 
Programs Based on Ordered Linear Type 

Koichi Kodama, Kohei Suenaga, Naoki Kobayashi 41 

An Implementation of Subtyping Among Regular Expression Types 

Kenny Zhuo Ming Lu, Martin Sulzmann 57 

An Implementation Scheme for XML Transformation Languages 
Through Derivation of Stream Processors 

Keisuke Nakano 74 

Session 3 

Detecting Software Defects in Telecom Applications Through 
Lightweight Static Analysis: A War Story 

Tobias Lindahl, Konstantinos Sagonas 91 

History Effects and Verification 

Christian Skalka, Scott Smith 107 

Controlled Declassification Based on Intransitive Noninterference 

Heiko Mantel, David Sands 129 

Session 4 

A Concurrent System of Multi-ported Processes with Causal 
Dependency 

Tatsuya Abe 146 




XII 



Table of Contents 



Concurrency Combinators for Declarative Synchronization 

Pawel T. Wojciechowski 163 

A Uniform Reduction Equivalence for Process Calculi 

Zining Cao 179 

Invited Talk 

Substructural Operational Semantics and Linear Destination-Passing 
Style 

Frank Pfenning 196 

Session 5 

PType System: A Featherweight Parallelizability Detector 

Dana Na Xu, Siau- Cheng Khoo, Zhenjiang Hu 197 

A Type Theory for Krivine-Style Evaluation and Compilation 

Kwanghoon Choi, Atsushi Ohori 213 

Region-Based Memory Management for a Dynamically-Typed Language 

Akihito Nagata, Naoki Kobayashi, Akinori Yonezawa 229 

Session 6 

Protocol Specialization 

Matthias Neubauer, Peter Thiemann 246 

Automatic Generation of Editors for Higher-Order Data Structures 
Peter Achten, Marko van Eekelen, Rinus Plasmeijer, Arjen van 
Weelden 262 

A MATLAB-Based Code Generator for Sparse Matrix Computations 

Hideyuki Kawabata, Mutsumi Suzuki, Toshiaki Kitamura 280 

Session 7 

D-Fusion: A Distinctive Fusion Calculus 

Michele Boreale, Maria Grazia Buscemi, Ugo Montanari 296 

A Functional Language for Logarithmic Space 

Peter M0ller Neergaard 311 

Build, Augment and Destroy, Universally 

Neil Ghani, Tarmo Uustalu, Varmo Vene 327 

Free U-Monoids: A Higher-Order Syntax with Metavariables 

Makoto Hamana 348 




Table of Contents XIII 

Invited Talk 

The Scala Experiment - Can We Provide Better Language Support for 
Component Systems? 

Martin Odersky 364 

Session 8 

Pointcuts as Functional Queries 

Michael Eichberg, Mira Mezini, Klaus Ostermann 366 

Formal Design and Verification of Real-Time Embedded Software 

Pao-Ann Hsiung, Shang-Wei Lin 382 

Session 9 

McJava A Design and Implementation of Java with Mixin-Types 

Tetsuo Kamina, Tetsuo Tamai 398 

A Relational Model for Object-Oriented Designs 

He Jifeng, Zhiming Liu, Xiaoshan Li, Shengchao Qin 415 

Exploiting Java Objects Behavior for Memory Management and 
Optimizations 

Zoe C.H. Yu, Francis C.M. Lau, Cho-Li Wang 437 



Author Index 



453 




A CLP Approach to Modelling Systems 



Joxan Jaffar* 

School of Computing, National University of Singapore, 

Republic of Singapore 117543 
j oxan@comp . nus . edu . sg 

We present a formal method for modelling the operational behavior of various 
kinds of systems of concurrent processes. A first objective is that the method be 
broadly applicable. A system can be described in terms of its processes written 
in a traditional syntax-based manner, or in some non-traditional form such as 
a timed automaton. The number of processes may be fixed, or parameterized, 
or, because of dynamic process creation, unbounded. The communication and 
synchronization between processes may be synchronous or not, and via shared 
variables or some form of channels. We may have a traditional interleaving of 
processes, or use a specific scheduling strategy. The observables modelled should 
not be restricted to just the values of the program variables, but possibly other 
attributes of the system such as its registers and cache, its clock and battery 
values, etc. An example application area which touches upon these characteristics 
is that of determining worst-case execution time. 

We choose to model a generic system S in the form of a CLP program P. The 
model-theoretic semantics of P shall characterize the “collecting” semantics of 
S, that is, those states that are observable. The proof-theoretic semantics of P, 
on the other hand, further characterize the “trace” semantics of S. An advantage 
of this CLP approach is that intricate details of the system can be captured in 
a familiar logical framework. 

We then present a specification language for an extensive class of system be- 
haviors. In addition to the traditional safety and liveness properties which specify 
the universality or eventuality of certain predicates on states, we introduce the no- 
tions of relative safety and relative progress. The former extends traditional safety 
assertions to accommodate non-behavioral properties such as symmetry, serializ- 
ability and commutativity between processes. The latter provides for specifying 
progress properties. Our specification method is not just for stating the property 
of interest, but also for the assertion of properties held at various program points. 

Finally, we present an inference method, based upon a notion of inductive 
tabling , for proving an assertion A. This method can use assertions that have al- 
ready been proven, use the assertion A itself, in a manner prescribed by induc- 
tion principles, and dynamically generate new assertions. All these properties are 
shown to be useful in preventing redundant computations, which then can lead 
to efficient proofs. Our proof method thus combines the search characteristic of 
model-checking and abstract interpretation, and methods of inductive assertions. 

We demonstrate a prototype implementation on some benchmark examples. 



* Joint work with Andrew Santosa and Razvan Voicu. 
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An Algebraic Approach to Bi-directional 

Updating 



Shin-Cheng Mu, Zhenjiang Hu, and Masato Takeichi 

Department of Information Engineering, 
University of Tokyo, 

7-3-1 Hongo, Bunkyo-ku, Tokyo 113, Japan 
{scm,hu, takeichi }@ipl . t .u-tokyo .ac.jp 



Abstract. In many occasions would one encounter the task of maintain- 
ing the consistency of two pieces of structured data that are related by 
some transform — synchronising bookmarks in different web browsers, 
the source and the view in an editor, or views in databases, to name a 
few. This paper proposes a formal model of such tasks, basing on a pro- 
gramming language allowing injective functions only. The programmer 
designs the transformation as if she is writing a functional program, while 
the synchronisation behaviour is automatically derived by algebraic rea- 
soning. The main advantage is being able to deal with duplication and 
structural changes. The result will be integrated to our structure XML 
editor in the Programmable Structured Document project. 



1 Introduction 

In many occasions would one encounter the task of maintaining consistency of 
two pieces of structured data that are related by some transform. In some XML 
editors, for example [3,15], a source XML document is transformed to a user- 
friendly, editable view through a transform defined by the document designer. 
The editing performed by the user on the view needs to be reflected back to 
the source document. Similar techniques can also be used to synchronise several 
bookmarks stored in formats of different browsers, to maintain invariance among 
widgets in an user interface, or to maintain the consistency of data and view in 
databases. 

As a canonical example, consider the XML document in Figure 1(a) repre- 
senting an article. When being displayed to the user, it might be converted to 
an HTML document as in Figure 1(b), with an additional table of contents. 
The conversion is defined by the document designer in some domain-specific 
programming language. We would then wish that when the user, for example, 
adds or deletes a section in (b), the original document in (a) be updated corre- 
spondingly. Further more, the changes should also trigger an update of the table 
of contents in (a). We may even wish that when an additional section title is 
added to the table of contents, a fresh, empty section will be added to the article 
bodies in both (a) and (b). All these are better done without too much effort, 
other than specifying the transform itself, from the document designer. 

W.-N. Chin (Ed.): APLAS 2004, LNCS 3302, pp. 2-20, 2004. 

(c) Springer- Verlag Berlin Heidelberg 2004 
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View-updating [5, 7, 10, 14, 1] has been intensively studied in the database 
community. Recently, the problem of maintaining the consistency of two pieces 
of structured data was brought to our attention again by [12] and [11]. Though 
developed separately, their results turn out to be surprisingly similar, with two 
important features missing. Firstly, it was assumed that the transform is total 
and surjective, which ruled out those transforms that duplicate data. Secondly, 
structural changes, such as inserting to or deleting from a list or a tree, were not 
sufficiently dealt with. 



<article> 

<title>Program inversion 
</title> 



<html> 

<hl>Program inversion</hl> 



<section> 

<title>0ur first ef f ort</title> 

<p>. . ,</p> 

</section> 

<section> 

<title>0ur second ef f ort</title> 

<p>. . ,</p> 

</section> 

</ article> 



<olxli>0ur first effort</li> 
<li>0ur second effort</li> 
</ol> 



<div> 

<h3>0ur first effort</h3> 
<p> . . .</p></div> 

<div> 

<h3>0ur second effort</h3> 
<p> . . .</pX/div> 

</html> 



(a) 



(b) 



Fig. 1 . An XML article and its HTML view with a table of contents 



In this paper we will address these difficulties using a different approach in- 
spired by previous studies of program inversion [2,8]. We extend the injective 
functional language designed in [13], in which only injective functions are defin- 
able and therefore every program is trivially invertible. The document designer 
specifies the transform as if she were defining an injective function from the 
source to the view. A special operator for duplication specifies all element-wise 
dependency. To deal with inconsistencies resulting from editing, however, we 
define an alternative semantics, under which the behaviour of programs can be 
reasoned by algebraic rules. It will be a good application of program inversion 
[8] and algebraic reasoning, and the result will soon be integrated into our XML 
editor in the Programmable Structured Document project [15]. 

In Section 2 we give a brief introduction of the injective functional language in 
which the transforms are specified, and demonstrate the view-updating problem 
more concretely. An alternative semantics of the language is presented in Sec- 
tion 3, where we show, by algebraic reasoning, how to solve the view-updating 
problem. Section 4 shows some more useful transform, before we conclude in 
Section 5. 
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2 An Injective Language for Bi-directional Updating 

Assume that a relation X, specifying the relationship between the source and 
the view, is given. In [11], the updating behaviour of the editor is modelled 
by two functions getx S — » V and putx (S x V) — » S. The function getx 
transforms the source to the view. The function putx, on the other hand, returns 
an updated source. It needs both the edited view and the original source, because 
some information might have been thrown away. For example, if the source is 
a pair and getx simply extracts the first component, the second component 
is lost. The cached original source is also used for determining which value is 
changed by the user. A more symmetrical approach was taken in [12], where 
both functions take two arguments. The relation X is required to be bi-total 
(total and surjective), which implies that duplicating data, which would make 
the relation non-surjective, is not allowed. 

In this paper we will explore a different approach. We make getx S — > V and 
putx ■■ V — > S take one argument only, and the transform has got to be injective 
we shall lose no information in the source to view transform. A point-free lan- 
guage allowing only injective functions has been developed in [13] with this as one 
of the target applications. Duplication is an important primitive of the language. 

Restricting ourselves to injective functions may seem like a severe limitation, 
but this is not true. In [13], it was shown that for all possibly non-injective 
functions / :: A — » B, we can automatically derive an injective function f :: 
A — » (5, H) where H records book-keeping information necessary for inversion. 
The extra information can be hidden from the user (for example, by setting the 
CSS visibility if the output is HTML). In fact, one can always make a function 
injective by copying the input to the output, if duplication is allowed. Therefore, 
the key extension here is duplication, while switching to injective functions is 
merely a change of presentation rather than separating the original source 
and the edited view as two inputs to putx, we move the burden of information 
preserving to X . This change, however, allows putx itself to be simpler, while 
making it much easier to expose expose its properties, limitation, and possibly 
ways to overcome the limitation. 

In this section we will introduce the language, Inv with some examples, and 
review the view-updating problem in our context. Extensions to the language 
and its semantics to deal with the view-updating problem will be discussed in 
Section 3. Some readers may consider the use of a point-free language as “not 
practical”. We will postpone our defend to Section 5. 

2.1 Views 

The View datatype defines the basic types of data we deal with. 

View ::= Int \ String | () 

| ( View x View ) \ L View \ R View 
| List View \ Tree View 
List a [] | a : List a 
Tree a Node a ( List ( Tree a)) 
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The atomic types include integer, string, and unit, the type having only one 
value (). Composite types include pairs, sum (L View and R View), lists and 
rose trees. The (:) operator, forming lists, associates to the right. We also follow 
the common convention writing the list 1 : 2 : 3 : [] as [1, 2, 3]. More extensions 
dealing with editing will be discussed later. For XML processing we can think of 
XML documents as rose trees represented by the type Tree. This very simplified 
view omits features of XML which will be our future work. In fact, for the rest of 
this paper we will be mostly talking about lists, since the techniques can easily 
be generalised to trees. 

2.2 An Injective Language Inv 

The syntax of the language Inv is defined as below. We abuse the notation a bit 
by using Xy to denote the union of X and the set of variable names V. The * 
operator denotes “a possibly empty sequence of” . 

X ::= X “ | nil \ zero \ C 

| 6 | dup P | cmp B \ ini \ inr 
j X; A | id | A UI 
| A x A | assocr \ assocl \ swap 

I H{V: X v ) 

C ::= succ \ cons \ node 
B ::= < | < | > I > 

P ::= nil I zero I str Strinq I (S')*id 
S ::= | fst | snd 

The semantics of each Inv construct is given in Figure 2. A relation of type 
A — > B is a set of pairs whose first components have type A and second com- 
ponents type B, while a function 1 is one such that a value in A is mapped to 
at most one value in B. A function is injective if all values in B are mapped 
to at most one value in A as well. The semantics of every Inv program is an 
injective function from View to View. That is, the semantics function [] has 
type Inv — > View — > View. For example, nil is interpreted as a constant function 
always returning the empty list, while zero always returns zero. Their domain is 
restricted to the unit type, to preserve injectivity. 

The function id is the identity function, the unit of composition. The semi- 
colon (;) is overloaded both as functional composition and as an Inv construct. 
It is defined by (/; g) a = g (/ a). 

Union of functions is simply defined as set union. To avoid non-determinism, 
however, we require in / U g that / and g have disjoint domains. To ensure 
injectivity, we require that they have disjoint ranges as well. The domain of a 
function / :: A — > B , written domf , is the partial function (and a set) {(a, a) € 
A | 3b £ B :: (a, b) £ /}. The range of /, written ranf , is defined symmetrically. 
The product (/ x g) is a function taking a pair and applying / and g to the 
two components respectively. We make composition bind tighter than product. 
Therefore (/; g x h) means ((/; g) x h). 



1 For convenience, we refer to possibly partial functions when we say “functions” . 
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Inilj () 

I zero] () 
[s«cc] n 
[cons] (a, x ) 
[node] (a, x) 
[ini] a 
[inr] a 

M] a 



[] 

0 

n + 1 
a: x 

Node a x 
L a 
R a 
a 



[swap] (a, 6) =(b,a ) 

[assocr] ((a, 6), c) = (a, (6, c)) 



[associ] (a, (6, c)) = ((a, 6), c) 

[cmp <] (a, 6) = (a, b), if a < b 
[<5] a = (a, a) 

\f',9}x = Iff] (!/] *) 

If x ff] («. 6) = ([/] a . Iff] &) 

I/Uff] =[/]U[p], 

if dom f n dom g = ran / n ran g = 0 

IT] = 1/1° 

\mW = l F » F i 



Fig. 2. Functional semantics of Inv constructs apart from dup 



The fixed-point of F, a function from Inv expressions to Inv expressions, is 
denoted by p,F. We will be using the notation (X: expr ) to denote a function 
taking an argument X and returning expr. 

The converse of a relation R is defined by 

(6, a) € R° = (a, b) £ R 

The reverse operator r corresponds to converses on relations. Since all func- 
tions here are injective, their converses are functions too. The reverse of cons, 
for example, decomposes a non-empty list into the head and the tail. The reverse 
of nil matches only the empty list and maps it to the unit value. The reverse 
operator distributes into composition, products and union by the following rules, 
all implied by the semantics definition [/“] = [/]°: 

[(/; ff)l = Iff“l; IT] ITU = 1/1 

[(/ X ff)l = 10 r X 5“)] KfxFyi = Im(X: (F X“)“)] 

I(/Uff)l = l/lulffl 

The S operator is worth our attention. It generates an extra copy of its 
argument. Written as a set comprehension, we have 6a = {(n, (n,n)) \ n € A}, 
where A is the type 6 gets instantiated to. We restrict A to atomic types (integers, 
strings, and unit) only, and from now on use variable n and m to denote values 
of atomic types. To duplicate a list, we can always use map 6; unzip , where map 
and unzip are to be introduced in the sections to come. Taking its reverse, we 
get: 

6 a " = {((n,n),n) | n £ A} 

That is, 6" takes a pair and lets it go through only if the two components 
are equal. That explains the observation in [8] that to “undo” a duplication, we 
have to perform an equality test. 

In many occasions we may want to duplicate not all but some sub-component 
of the input. For convenience, we include another Inv construct dup which takes 
a sequence of “labels” and duplicates the selected sub-component. The label is 
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either fst, snd , cons"' , and node" . Informally, think of the sequence of labels as 
the composition of selector functions (fst and snd) or deconstructors, and dup 
can be understood as: 

[dup f\x = (x,lfj x) 

If we invert it, (dup f)" becomes a partial function taking a pair (x,n), and re- 
turns x unchanged iifx equals n. The second component n can be safely dropped 
because we know its value already. We write ( dupf )" as eqf. For example, 
dup (fst’ snd) ((a, n), b) yields (((o, n), 6), n), while eq (fst; snd) (((a, n), b), m) 
returns (( a,n ), b) if n = m. Formally, dup is defined as a function taking a list 
of function names and returns a function: 

dup id =6 

dup (fst- P) = (dup P x id)-, subl 
dup (snd- P) = (id x dup P); assocl 
dup (cons"-, P) = cons"-, dup P; (cons x id) 
dup (node"-, P) = node"; dup P; (node x id) 

Here, [s«6Z] ((o, 6), c) = (( a,c),b ), whose formal definition is given in 
Section 2.3. 

Another functionality of dup is to introduce constants. The original input is 
kept unchanged but paired with a new constant: 

[dup nil ] a = (a, []) 

[dup zero ] a = (a, 0) 

[dup (str s)] a = (a, s) 

Their reverses eliminates a constant whose value is known. In both directions 
we lose no information. 

The cmp construct takes a pair of values, and let them go through only if 
they satisfy one of the five binary predicates given by non-terminal B. 

2.3 Programming Examples in Inv 

All functions that move around the components in a pair can be defined in 
terms of products, assocr, assocl, and swap. We find the following functions 
useful: 

subr = assocl; (swap x id); assocr 
subl = assocr; (id x swap); assocl 
trans = assocr; (id x subr); assocl 

Their semantics, after expanding the definition, is given below: 

l subr\(a,(b,c )) =(b,(a,c)) 

[■ subl]((a,b),c ) =((a,c),b) 

I trans ] ((a, b), (c, d)) = ((a, c), (b, d)) 
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mktoc = denode article; cons "; (hi x cont); cons ; ennode html 
hi = denode title; ennode hi 
cont = extract ; ( enlist x body); cons 

extract = map (denode section; cons''; (denode title x id); dup fst; swap); unzip 

enlist = map (ennode li); ennode ol 

body = map ((ennode h3 x id); cons; ennode div) 



denode s = node"; swap; eq (str s) 
ennode s = (denode s)“ 

Fig. 3. An Inv program performing the transform from Figure 1(a) to Figure 1(b). 
String constants are written in typewriter font 



Many list-processing functions can be defined recursively on the list. The 
function map applies a function to all elements of a list; the function unzip takes 
a list of pairs and splits it into a pair of lists. They can be defined by: 

mapf = /z(X : nil"; nil U 

cons"; (f x X); cons) 
unzip = n(X : nil"; 8; (nil x nil) U 

cons"; (id x X); trans; ( cons x cons)) 

This is what one would expect when we write down their usual definition in 
a point-free style. The branches starting with nil" are the base cases, matching 
empty lists, while cons" matches non-empty lists. It is also provable from the 
semantics that (mapf)" = mapf". 

The function merge takes a pair of sorted lists and merges them into one. 
However, by doing so we lose information necessary to split them back to the orig- 
inal pair. Therefore, we tag the elements in the merged list with labels indicating 
where they were from. For example, merge ([1, 4, 7], [2, 5, 6]) = [L 1, R 2, L4, R 5, 
R6,L7]. It can be defined in Inv as below: 

merge = g(X : eq nil; map ini U 

swap; eq nil; map inr U 
(cons" x cons"); trans; 

((leq x id); assocr; (id x subr; (id x cons); X); (ini x id) U 
((gt; swap) x id); assocr; (id x assocl; ( cons x id); X); (inr x id)); 
cons ) 

where leq = cmp (<) and gt = cmp (>). 

As a final example, the program in Figure 3 performs the transform from 
Figure 1(a) to Figure 1(b). It demonstrates the use of map, unzip and dup. For 
brevity, the suffixing id in dup (fst; id) will be omitted. 

2.4 The View-Updating Problem 

Now consider the scenario of an editor, where a source document is transformed, 
via an Inv program, to a view editable by the user. Consider the transform 
toe = map (dup fst); unzip, we have: 
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toe [(1, “a”), (2, (3, “c”)] = ([(1, “a”), (2, “6”), (3, V’)], [1, 2, 3]) 

Think of each pair as a section and the numbers as their titles, the function 
toe is a simplified version of the generation of a table of contents, thus the name. 

Through a special interface, there are several things the user can do: change 
the value of a node, insert a new node, or delete a node. Assume that the user 
changes the value 3 in the “table of contents” to 4: 

([(1, “a”), (2, “6”), (3, “c”)], [1,2,4]) 

Now we try to perform the transformation backwards. Applying the reverse 
operator to toe, we get ( map ( dupfst)\ unzip)" = unzip"' map ( eqfst ). Applying 
it to the modified view, unzip " maps the modified view to: 

[((1, “«”),1),((2, “&”),2),((3, “c”), 4)] 

pairing the sections and the titles together, to be processed by map (eqfst). 
However, ((3, “c”),4) is not in the domain of eqfst because the equality check 
fails. We wish that eqfst would return (4, “c”) in this case, answering the user’s 
wish to change the section title. 

Now assume that the user inserts a new section title in the table of contents: 

([(1,V’),(2,“&”),(3, “c”)],[l, 2,4,3]) 

This time the changed view cannot even pass unzip" , because the two lists 
have different lengths. We wish that unzip " would somehow know that the two 
3’s should go together and the zipped list should be 

[((1, “a”),l),((2, “6”), 2), (_L, 4), ((3, V’),3)] 

where _L denotes some unconstrained value, which would be further constrained 
by map ( dupfst ) to (4,_L). The Inv construct eqfst should also recognise _L and 
deal with it accordingly. 

In short, we allow the programmer to write Inv transforms that are not sur- 
jective. Therefore it is very likely that a view modified by the user may fall 
out of the range of the transform. This is in contrast of the approach taken in 
[12] and [11]. The two problems we discussed just now are representative of the 
view-updating problem. There are basically two kinds of dependency we have 
to deal with: element-wise dependency, stating that two pieces of primary-typed 
data have the same value, and structural dependency, stating that two pieces of 
data have the same shape. 

One possible solution is to provide an alternative semantics that extends the 
ranges of Inv constructs in a reasonable way, so that the unmodified, or barely 
modified programs can deal with the changes. We will discuss this in detail in 
the next section. 




10 



S.-C. Mu, Z. Hu, and M. Takeichi 



3 Alternative Semantics 

We will need some labels in the view, indicating “this part has been modified by 
the user.” We extend the View data type as described below: 

View | *Int | * String 

List a ... | a ® List a \ a 0 List a 

Here the * mark applies to atomic types only, indicating that the value has 
been changed. The view a(Bx denotes a list a: x whose head a was freshly inserted 
by the user, while a 0 x denotes a list x which used to have a head a but was 
deleted. The deleted value a is still cached for future use. The two operators 
associate to the right, like the cons operator (:). A similar set of operators can 
be introduced for Tree but they are out of the scope of this paper. 

The original semantics of each Inv program is an injective function. When 
the tags are involved, however, we lost the injectivity. Multiple views may be 
mapped to the same source. For example, the value 1 is mapped to (1, 1) by 8. 
In the reverse direction, (n,*l) and (*l,n), for all numerical n, are all mapped 
to 1. Similarly, all these views are mapped back to [1,2,3] when the transform 
is map succ: [2, 3, 4], o0 [2,3,4], 2 : a© [3,4], 2© [3,4], 2 : 3$ [4] for all a. 

We define two auxiliary functions notag! and ridtag. The former is a partial 
function letting through the input view unchanged if it contains no tags. The 
latter gets rid of the tags in a view, producing a normal form. Their definitions are 
trivial and omitted. The behaviour of the editor, on the other hand, is specified 
using two functions getx and putxi both parameterised by an Inv program X: 

get x = notag!; [A] 
put x C [A “] ; ridtag 

The function getx maps the source to the view by calling A. The function 
putx , on the other hand, maps the (possibly edited) view back to the document 
by letting it go though A“ and removing the tags in the result. Here C denotes 
“functional refinement”, defined by fCg if and only if / C g and domf = 
dom g. In general [A"]; ridtag is not a function since [A'“] may leave some values 
unspecified. However, any functional refinement of [A“]; ridtag would satisfy the 
properties we want. The implementation can therefore, for example, choose an 
“initial value” for each unspecified value according to its type. The initial view 
is obtained by a call to getx • When the user performs some editing, the editor 
applies putx to the view, obtaining a new source, before generating a new view 
by calling getx again. 

In the original semantics of Inv, the “ operator is simply relational converse. 
In the extended semantics, the u operator deviates from relational converse for 
three constructs: 8, cons and sync, to be introduced later. For other cases we still 
have [/“] = |/]°. The distributivity rules of ” given in Section 2.2 are still true. 

In the next few sub-sections we will introduce extensions to the original se- 
mantics in the running text. A summary of the resulting semantics will be given 
in the end of Section 3.2. 
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3.1 Generalised Equality Test 

The simple semantics of 6 a, where A is an atomic type, is given by the set 
{((n, n), n) \ n € A}. To deal with editing, we generalise its semantics to: 

[5"] ( n , n) = n |<5"] (m, m) = *n 

[5"] (m, m) = m [6"] (m, m) = n 

When the two values are not the same but one of them was edited by the user, 
the edited one gets precedence and goes through. Therefore (m, m) is mapped 
to m. If both values are edited, however, they still have to be the same. Note 
that the semantics of 6 does not change. Also, we are still restricted to atomic 
types. One will have to call map 6 ; unzip to duplicate a list, thereby separate the 
value and structural dependency. 

The syntax of dup can be extended to allow, a possibly non-injective func- 
tion. The results of the non-injective function, and those derive from them, are 
supposed to be non-editable. It is a useful functionality but we will not go into 
its details. 

3.2 Insertion and Deletion 

Recall unzip defined in Section 2.3. Its reverse, according to the distributivity of 
v , is given by: 

unzip " = p,(X : ( nil" x nil")] 6"\ nil U 

(cons" x cons")' trans ; (id x A); cons) 

The puzzle is: how to make it work correctly with the presence of 0 and 0 
tags? We introduce several new additional operators and types: 

— two new Inv operators, del and ins, both parameterised by a view. The 
function del a :: [A] — > [A] introduces an (a 0 ) tag, while ins a :: [A] — » [A] 
introduces an (a 0 ) tag. 

— two kinds of pairs in View: positive (a, b) + and negative (a, b)~. They are 
merely pairs with an additional label. They can be introduced only by the 
reverse of fst ± and snd± functions to be introduced below. The intention 
is to use them to denote pairs whose components are temporary left there 
for some reason. 

— six families of functions fst° and snd°, where □ can be either +, — , or 
nothing, defined by 

fst° (a, b) a = a 
snd° (a, 6) D = b 

That is, fst£ eliminates the second component of a positive pair only if it 
equals b. Otherwise it fails. Similarly, snd a eliminates the first component 
of an ordinary pair only of it equals a. When interacting with existing 
operators, they should satisfy the algebraic rules in Figure 4. In order to 
shorten the presentation, we use □ to match +, — and nothing, while ± 
matches only + and — . The □ and ± in the same rule must match the 
same symbol. 
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With the new operators and types, an extended unzip capable of dealing 
with deletion can be extended from the original unzip by (here denotes the 
original two branches of unzip): 

unzip " = n(X : . . . Vo, b- 

((ins aY x (ins by ); X; ins (a, b) U 
((ins a)^ x isList ); X; ins (a, b) U 
(isList x (ins by); X; ins (a, b) U 
((del a)^ x (del &)“); X; del (a, b) U 
((del a)^ x cons"; snd~ b ); X; del (a, b) U 
(cons"; snd" a x (del &)“); X; del (a, b )) 

where a and b are universally quantified, and isList = nil"; nil U cons"; cons, a 
subset of id letting through only lists having no tag at the head. 

Look at the branch starting with ((ins a)" x (ins b)"). It says that, given a 
pair of lists both starting with insertion tags a © and 6 © , we should deconstruct 
them, pass the tails of the lists to the recursive call, and put back an ((a, b) © ) 
tag. If only the first of them is tagged (matching the branch starting with 
((ins a)" x isList)), we temporarily remove the a © tag, recursively process 
the lists, and put back a tag ((a, b) © ) with a freshly generated b. The choice of 
b is non-deterministic and might be further constrained when unzip is further 
composed with other relations. The situation is similar with deletion. In the 
branch starting with (del a x snd b °; cons) where we encounter a list with an a 
deleted by the user, we remove an element in the other list and remember its 
value in b. Here universally quantified b is used to match the value — all the 
branches with different 6’s are unioned together, with only one of them resulting 
in a successful match. 

It would be very tedious if the programmer had to explicitly write down 
these extra branches for all functions. Luckily, these additional branches can be 
derived automatically using the rules in Figure 4. In the derivations later we will 
omit the semantics function [] and use the same notation for the language and 
its semantics, where no confusion would occur. This is merely for the sake of 
brevity. 

In place of ordinary cons, we define two constructs addressing the dependency 
of structures. Firstly, the bold cons is defined by:: 

cons = cons U (J a .. A (snd~ a ; del a) U U a .. A (snd+; ins a) 

(f x g);fst° gb) = fst b ;/, if g total assocl; (fst° x id) = (id x snd b ) 

(f x g); snd°j a ) = snd°; g, if / total assocl; (snd° x id) = (snd° U snd a ) 

swap; snd° = fst° assocl; snd^^ = snd°; (snd b U snd b ) 

snda"; nil = ( A [] — > a) 

Fig. 4. Algebraic rules. Here (A [] — > a) is a function mapping only empty list to a. Only 
rules for assocl are listed. The rules for assocr can be obtained by pre-composing assocr 
to both sides and use asscor; assocl = id. Free identifiers are universally quantified 
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Secondly, we define the following sync operator: 

sync = ( cons x cons ) 
sync ' = (cons'' x cons'') 

u U a .beA((( del a Y', snd Y x (del b ) “ ; sndl “) 

U ((del a)'] snd ~ a ' x cons''] sndb ; sndl') 

U (cons'] snd a ] snd~ a ' x (del b)'] snd~ b ')) 

u Ua,6e^i(((* ns o)“; x (* ns 

U ((ins a)“; snd+' x isList] snd b ') 

U (isList] snd b ' x (ins &)“; snd b ')) 

In the definition of unzip, we replace every singular occurence of cons with 
cons, and every (cons x cons) with sync. The definition of sync' looks very 
complicated but we will shortly see its use in the derivation. Basically every 
product corresponds to one case we want to deal with: when both the lists are 
cons lists, when one or both of them has a © tag, or when one or both of them 
has a © tag. 

After the substitution, all the branches can be derived by algebraic reasoning. 
The rules we need are listed in Figure 4. To derive the first branch for insertion, 
for example, we reason: 

unzip' 

© {fixed-point} 

sync'] trans] (id x unzip)] cons 

© {since sync' © ((ins a)'] snd+' x (ins b)'] snd b ') for all a, b} 

((ins a)' x (ins b)'); (snd+' x (ins b)')] trans] (id x unzip)] cons 
© {claim: (snd+' x snd b ')] trans D (snd^ a ^)“} 

((ins a)' x (ins b)')] (snd^ a b ^)' ; (id x unzip) ]c ons 
= {since (/ x </); snd^ a = snd+]g for total /} 

((ins a)' x (ins b)')] unzip] (snd^ a b ^)'] cons 
© {since cons D snd* a b y ins (a, b)j 

((ins a)' x (ins b)')] unzip] (snd^ a b )Y\ sndf a b y ins (a, b) 

= {since snd+'] snd+ = id} 

((ins a)' x (ins b)')] unzip] ins (a, b) 

We get the first branch. The claim that trans'] (snd° x snd b ) = snd° a ^ can 
be verified by the rules in Figure 4 and is left as an exercise. The introduction 
of two kinds of pairs was to avoid the suffix being reduced to (del (a, b))' in the 
last two steps. To derive one of the branches for deletion, on the other hand, 
one uses the inclusion sync' D ((del a)'] snd~ a ' x cons'] sndb] sndl') f° r the first 
step, and cons D snd) a b y, del (a, b) and (snd) a b ^)'] snd( a ^ = id fort the last 
step. All the branches can be derived in a similar fashion. 
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MO = [] 


VI 

o 


a, b) n = (a, b) a , if a < b 


{zero} () = 0 


[/; 5 ] x 


= [.<?] ([/] z) 


[succ] n — n + 1 


[/xj](fl,fi) D = ([/] a, [ 0 ] b) D 


[cons] (a, x) — a: x 


[/M 


= [/] u [ 5 ], 


[node] (a, x) = Node a x 


if dom / n dom g = ran / D ran 


\inl\ a = L a 


M] 


= [FtiFj 


[mr] a — R a 






[id] a = a 


IT] 


= [/F 




[/;p 1 


= [$“]; VI 


[swap] (a, b) D =(b,a)° 


[(/ x 9)1 


= [(/“ x <?“)] 


[assocr] ((a.fe)*, c) ± = 




= VI u [V] 


[assocr] ((«,&)*, c) =(a,(6,c) ± ) 


[Ml 


= V(X ^ (F X")"j 



[assocr] ((a,6),c) ± =(a,(6,c)) ± 
assocl = assocr " 

crr = / 

[5] n = (n, n) 

[5“](n, n) D =n 
[<5“] (*n, *n) D = *n 
[<5“] (*n, m)° = *n 
[<5“] (m, *n)° = *n 

[ dup nil ] a = (a, []) 

\(dupnil)"\(a, []) D =a 

[dnp zero\ a = (a,0) 

[(dnp «erO)“] (a, 0) D =a 

[dnp(strs)]a = (a, s) 

l(dup ( str s))“] (a, s) D = a 

cons = cons 

u \J a ,, A ( snd ^ del a ) 

U U a::A( Snd a'’ inS a ) 



lfsa(a,b)° =b 
[snd°] (a, b) n = a 
[ del a] (a © x) = (a, x)~ 

[ins a] (a © x) = (a, x) + 

dup id =6 

dup ( fst ; P) = ( dup P x id); subl 

dup ( snd ; P) = (id x dup P); assocl 
dup (cons"; P) = cons"; dup P; (cons x id) 
dup (node"; P) = node"; dup P; (node x id) 

sync = (cons x cons) 
sync " = (cons" x cons") 
u U a ,beA((( del a ')' J ’ snd ^ x ( delb)";sndl “) 
U ((del a)"; snd~" x cons"; snd/,; sndl") 
U (cons"; snd a ; snd~ a " x (del b)"; sndl")) 
u Ua,66^i(((* ns a )“; snd+* x (ins b)";snd 6 + “) 
U((ins a)"; snda" x isList; snd£") 

U (isList; snd x (ins b)"; snd£")) 



Fig. 5. Summary of the alternative semantics. The patterns should be matched from 
the top-left to bottom-left, then top-right to bottom-right 



3.3 The Put-Get-Put Property and Galois Connection 

A valid Inv program is one that does not use fst° and snd° apart from in cons 
and sync. The domain of getx, for a valid A, is restricted to tag-free views, 
so is its range. In fact, notag?; [A] reduces to the injective function defined 
by the original semantics. Therefore, getx; getx° = domgetx- Furthermore, 
notag?; ridtag = notag?. As a result, for all valid Inv programs A we have the 
following get-put property: 



getx; putx = dom getx 



(1) 
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This is a desired property for our editor: mapping an unedited view back to 
the source always gives us the same source document. 

On the other hand, putx', getx C id is not true. For example, ( puts', gets) (*a, b) 
= {a, a) ^ (*a, b). This is one of the main differences between our work and that 
of [12] and [11]. They both assume the relation X to be bi-total, and that the 
put-get property putx ; getx = id holds. It also implies that duplication cannot 
be allowed in the language. 

Instead, we have a weaker property. First of all, for all valid X we have 
dom getx Q ran putx ■ That is, every valid source input to getx must be a result 
of putx for at least one view, namely, the view the source get mapped to under 
the original semantics. Pre-composing put X to (1) and use putx ; dom getx Q 
putx i ran putx = putx , we get the following put-get-put property: 

put x ; get x i putx C putx (2) 

When the user edits the view, the editor calls the function put X to calculate 
an updated source, and then calls getx to update the view as well. For example, 
(*a, 6) is changed to (a, a) after puts', gets- With the put-get-put property we 
know that another putx is not necessary, because it is not going to change the 
view — the result of putx', getx', putx, if anything, is the same as that of putx- 

It is desirable to have putx', getx', putx = putx - However, this is not true, and 
dom getx 7^ ran putx- For a counter-example, take X = (6 x id); assocr; {id x 
6). The function getx takes only pairs with equal components and returns it 
unchanged. Applying putx to (*&, a) results in (b, a), which is not in the domain 
of getx- Such a result is theoretically not satisfactory, but does not cause a 
problem for our application. The editor can signal an error to the user, saying 
that such a modification is not allowed, when the new source is not in the domain 
of getx- The domain check is not an extra burden since we have to call getx 
anyway. 

A Galois connection is a pair of functions / :: A — » B and g :: B — > A 
satisfying 

/a - < y = x di g y (3) 

Galois connected functions satisfy a number of properties, including f;g;f = 
f. For those X that dom getx = ranput-x do hold, getx and putx satisfy (3), if 
we take A to be equality on tag-free Views and < to be {put-x; getx) ■ That is, 
s A s 1 if and only if the two sources s and s' are exactly the same, while a view 
v is no bigger than v' under < if there exists a source s such that v = getx s and 
s = putv'. For example, {n,n) is no bigger than {m,m), {m,*n), {*n,*n), and 
(n, n) itself under <, when the transform is S. The only glitch here is that < is 
not reflexive! In fact it is reflexive only in the range of get-x — the set of tag-free 
views. However, this is enough for getx and putx to satisfy most properties of 
a Galois connection. 

3.4 Implementation Issues 

In our experimental implementation, we have a simple interpreter for Inv. One 
way to incorporate the algebraic rules in the previous section in the imple- 
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mentation is to call a pre-processor before the program is interpreted. Another 
possibility is to build the rules implicitly in the interpreter. In this section we 
will talk about how. 

The abstract syntax tree of Inv is extended with new constructs cons and 
sync. The “intermediate” functions introduced in the last section, namely ins, del, 
fst±s and snd ± s, are not actually represented in the abstract syntax. Instead, 
we extend the value domain View with additional constructs: 

View ::=... | ( View, + View) | (+ View, View) 

| {View, -View) \ {-View, View) 
j -L | NilTo View 

Conceptually, after we apply snd+“ to a value b, we get {+a, b), while {-a, b) 
is the result of applying snd'“ to b. The reader can think of them as a note 
saying “the value should have been b only, but we temporarily pair it with an a, 
just to allow the computation to carry on.” Or one can think of it as a pending 
application of snd+ or snd" a . The J_ symbol denotes an unconstrained value. 
Finally, NilTo a denotes a function taking only [] and returns a. 

To implement the sync operator, we add the following definitions (some cases 
are omitted): 

[■ sync “] {a: x, b : y) = {{a, x), {b, y)) \sync"\ {a ®x,y) = {{+a, x), (+_L, y)) 
[■ sync “] {a 0 x, b: y) = {{-a, x), {-b, y)) {sync ] {x, b © y) = ((+_L, x), {+b, y)) 
{sync^j {a: x,bQy) = {{-a, x), {-b, y)) 

The first clause is simply what {consxcons)° would do. The second clause 
shows that when there is a deletion in the first list, we throw away an element 
in the second list as well, while keeping note of the fact by the _) tag. It cor- 
responds to the {del a°; snd~ a ° x cons°; snd' b ; snd~ b °) branch of {consxcons) . The 
fourth branch, on the other hand, corresponds to {{ins a)°; snd+° x isList\ snd b °). 
The newly introduced, unconstrained value b is represented by _L. 

Now we build in some extra rules for cons and cons'": 

[cons] {-a, x) = aQ x [cons“] (o0i) = {-a, x) 

[cons] (+o, x) = a 0 x [cons“] (a®i) = (+a, x) 

They correspond to the fact that snoT; cons = del and snd a “; cons = ins a. 
Also, some additional rules for assocr: 

[assocr] {{a, +b), c) = {a, {+b, c)) [assocr] {+{a, b),c) = (+a, {+b, c)) 

[assocr] ((+a, b), c) = (+a, {b, c)) 

The three clauses correspond to the rules for assocl in the left column of 
Figure 4. Finally we need some rules for dup nil and its inverse eqnil: 

\{eq nil)] {-a, []) = NilTo a \{dup nil)] {NilTo a) = {-a, []) 

which corresponds to the rule snd°“; eqnil = (A [] — > a) in Figure 4. 

4 More Examples 



In this section we will show more transforms defined in Inv that do satisfy 
dom getx = ran putx and how they react to user editing. 
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4.1 Snoc and List Reversal 

The function snoc :: (a, List a) — > List a, appending an element to the end of a 
list, can be defined recursively as: 

snoc = p.(X : eq nil ; dup nil ; cons U 

(id x cons°); sw&r; (id x X); cons) 

For example [snoc] (4, [1, 2, 3]) = [1,2, 3,4]. Conversely, snoc " extracts the 
last element of a list. But what is the result of extracting the last element of a 
list whose last element was just removed? We expand the base case: 
snoc " 

0 {fixed-point} 
cons“; eq nil ; dr«p nil 
0 {specialising cons 0 snd~ a ; deZ a} 

(deZ a)“; snd" a "; ea nil; dup nil 
= {since snd~ a "; eg niZ = (A[[ — > a)} 

( del a)"; (A[] — > a); dup nil 

= {since snd“ eg niZ = (A[] — > a) => snd' “ = (A[] — + a); dup nil} 

(del a)"; snd“ “ 

That is, for example, eval snoc " (4 0 []) = (-4, []). Inductively, we have eval 
snoc " (1 : 2 : 3 : 4 0 []) = (-4,1 : 2 : 3 : []), which is reasonable enough: by 
extracting the last element of a list whose last element, 4, is missing, we get a 
pair whose first element should not have been there. 

The ubiquitous fold function on lists can be defined by 

fold f g = n(X : nil"; g U cons ";(id x X);f) 

The function reverse, reverting a list, can be defined in terms of fold as 
reverse = fold snoc nil. Unfolding its definition, we can perform the following 
refinement: 

reverse “ 

D {unfolding the definitions} 
snoc';(id x reverse “); cons 
0 {by the reasoning above, snoc “ D (del a) w ; snd" a “} 

(del a)“; snd“ (id x reverse"); cons 
= {since (/ x g); sndj a = snd~ a ; g for total /} 

(del a)"; reverse"; snd~ a "; cons 
0 {since cons 0 snd" a ; del a and snd" a "; snd~ a = id} 

(del a)"; reverse"; del a 

which shows that reverse" regenerates the 0 tags (and, similiarly, 0 tags) 
upon receipt of the “partial” pairs returned by snoc. For example, we have 
eval reverse (1 : 2 : 304: []) = 4:302:1: [] which is exactly what we want. 
A lesson is that to deal with lists, we have to first learn to deal with pairs. 
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4.2 Merging and Filtering 

Recall the function merge defined in Section 2.3, merging two sorted lists into 
one, while marking the elements with labels remembering where they were from: 

merge ([1,4,7], [2,5,6]) = [L 1, R 2, L4, R 5, R 6, L7] 

Filtering is an often needed feature. For example, in a list of ( author , article) 
pairs we may want to extract the articles by a chosen author. The Haskell Prelude 
function filter :: ( a — > Bool) — > List a — > List a , returning only the elements in 
the list satisfying a given predicate, however, is not injective because it throws 
away some items. A common scenario of filtering is when we have a list of 
sorted items to filter. For example, the articles in the database may be sorted 
by the date of creation, and splitting the list retains the order. If we simplify 
the situation a bit further, it is exactly the converse of what merge does, if we 
think of L and R as true and false! 

To make merge work with editing tags, we simply replace every occurrence 
of cons with cons, including the cons in ( cons x cons). This time the latter shall 
not be replaced by sync because we certainly do not want to delete or invent 
elements in one list when the user edits the other! This merge does behave as 
what we would expect. For example, when an element is added to the split list: 

merge (1 : 3 © 4 : 7 : [], [2, 5, 6]) = L 1 : R 2 : L3 © [LA, R 5, R 6, L7] 
the new element is inserted back to the original list as well. 

5 Conclusion 

Bi-directional updating, though an old problem [5, 7, 10, 14, 1], has recently at- 
tracted much interests, each took a slightly different approach according to their 
target application. We have developed a formalisation of bi-directional updating 
which is able to deal with duplication and structural changes like insertion and 
deletion. From a specification X, written as an injective function, we induce 
two functions getx and putx that satisfy the important get-put and put-get-put 
properties. To find out how putx reacts to user editing, one can make use of 
algebraic reasoning, which also provides a hint how the formalisation can be 
implemented in an interpreter. 

Our formalisation deals with duplication and structural changes at the cost of 
introducing editing tags, which is okay for our application — to integrate it to our 
structural editor in [15]. The complementary approach taken by [11], on the other 
hand, chooses not to use any information how the new view was constructed. 
Upon encountering inconsistency, the system generates several possible ways to 
resolve the inconsistency for the user to choose from. It would be interesting to 
see whether there is a general framework covering both approaches. 

Another feature of our work is the use of an injective language, and vari- 
ous program derivation and inversion techniques. The injective language Inv has 
been introduced in [13], where it is also described how to automatically derive 
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an injective variant for every non-injective program. So far we have a primi- 
tive implementation. For an efficient implementation, however, the techniques 
described in [9] based on parsing may be of help. 

The history of point-free functional languages many be traced to Backus’s FP 
[4] , although our style in this paper is more influenced by [6] . A number of well- 
established libraries adopt a point-free style, such as the XML processing library 
HaXml [16]. Furthermore, there is a tedious, uninteresting way of converting a 
certain class of pointwise functional programs into Inv. The class of programs is 
essentially the same as the source language in [9], that is, first-order functional 
programs with linear patterns in case expressions, where every variable is used 
at least once. 
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Abstract. Modular programming enjoys many well-known advantages 
but the composition of modular units may also lead to inefficient pro- 
grams. In this paper, we propose an invasive composition method which 
strives to reconcile modularity and efficiency. Our technique, network 
fusion, automatically merges networks of interacting components into 
equivalent sequential programs. We provide the user with an expres- 
sive language to specify scheduling constraints which can be taken into 
account during network fusion. Fusion allows to replace internal commu- 
nications by assignments and alleviates most time overhead. We present 
our approach in a generic and unified framework based on labeled tran- 
sition systems, static analysis and transformation techniques. 



1 Introduction 

Modular programming enjoys many well-known advantages: readability, main- 
tainability, separate development and compilation. However, the composition of 
modular units (components) gives rise to efficiency issues. Sequential compo- 
sition poses space problems: the producer delivers its complete output before 
the consumer starts. Parallel composition relies on threads, synchronization and 
context switches which introduce time overhead. 

In this paper, we propose an invasive composition method, network fusion, 
which strives to reconcile modularity and efficiency. Our technique automati- 
cally merges networks of interacting components into equivalent sequential pro- 
grams. Our approach takes two source inputs: a network of components and 
user-defined scheduling constraints. Networks are formalized as Kahn Process 
Networks (Kpns) [7] a simple formal model expressive enough to specify compo- 
nent programming and assembly. Scheduling constraints allow the user to choose 
the scheduling strategy by specifying a set of desired executions. The operational 
semantics of Kpns and scheduling constraints are both formalized as guarded 
labeled transition systems (Lts). 
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Network fusion is an automatic process which takes a Kpn and scheduling 
constraints and yields a sequential program respecting the constraints. Note 
that constraints may introduce artificial deadlocks, in which case the user will 
be warned. The resulting program must be functionally equivalent to the Kpn 
modulo the possible deadlocks introduced by constraints. Fusion alleviates most 
time overhead by allowing the suppression of context switches, the replacement 
of internal communications by assignments to local variables and optimizations 
of the resulting sequential code using standard compiling techniques. Network 
fusion can be seen as a generalization of filter fusion [12] to general networks 
using ideas from aspect-oriented programming [8] (scheduling constraints can be 
seen as an aspect and their enforcement as weaving). 

The four main steps of the fusion process are represented in Figure 1. 




Fig. 1. Main steps of network fusion 



— The first step is the abstraction of the network into a finite model called 
an Abstract Execution Graph (Aeg). An Aeg over-approximates the set of 
possible executions traces. We do not present this step in details since it 
relies on very standard analysis techniques ( e.g abstract interpretation) 
and many different abstractions are possible depending on the desired level 
of precision. Instead, we focus on the properties that an Aeg must satisfy. 

— The second step consists in enforcing constraints. This is expressed as a 
synchronized product between guarded Lts (the Aeg and the constraints). 
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In general, this step does not sequentialize completely the execution and 
leaves scheduling choices. 

— The third step completes the scheduling of the constrained Aeg. Several 
strategies can be used as long as they are fair. Again, these strategies can 
be expressed as guarded Lts and scheduling as a synchronized product. 

— The fourth step, concretization, maps the scheduled (serialized) Aeg to a 
single sequential program. Further transformations ( e.g ., standard optimiza- 
tions) can then be carried out on the resulting program. 

We have chosen to present fusion in an intuitive and mostly informal way. In 
particular, we do not provide any correctness proofs. They would require a com- 
plete description of the operational semantics of Kpn too long to fit space limits. 
This paper is organized as follows. Section 2 presents the syntax and semantics of 
Kpns. Section 3 describes Aegs and defines the abstraction and concretization 
steps which both relate Aeg to concrete models (programs and Kpns). Section 
4 presents the language of constraints and the two main transformation steps of 
fusion: constraints enforcement and scheduling. We propose three extensions of 
the basic technique in Section 5 and, finally, we review related work and conclude 
in Section 6. 



2 Networks 

We start by providing the syntax of components and networks. We just out- 
line their semantics and provide some intuition using an example. A complete 
structural operational semantics for Kpns can be found in [6]. 

2.1 Basic Components 

Components are made of commands c of the form: 

h : 9 I l 2 

where l\ and 1 2 denote labels, g a guard and a an action. An action is either a 
read operation on an input channel fix, a write operation on an output channel 
f\x, or an internal action i (left unspecified). A component (or process) p is a 
set of commands {ci, . . . , c n }. If the current program point of a component p is 
l\, if li : g | a 1 2 is a command of p and the guard g is true, then the action a 

can be executed and the program point becomes £ 2 - 

The components we consider in this paper represent valid, sequential and 
deterministic programs. They have the following restrictions: 

— A component has a unique entry point denoted by the label Iq . 

— All the labels used in p are defined in the Ihs of commands. 

— Two commands with the same label have mutually exclusive guards. 

The program P in Figure 2 sends the set N in increasing order on channel /. 
Program C assigns a with the value read on channel / if a < b or assigns b with 
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Fig. 2. A Simple Kpn and its trace semantics 



the value read on the channel t otherwise. Then, it sends o + b on the channel o 
and loops. Note that guards are omitted when they are true. 

The semantics of a component p is expressed as a Lts (E p , (l 0 , sq),£ p , — > p ) 
where: 

— U p is an (infinite) set of states (l, s ) where l a label and s a store mapping 
variables to their values. 

— (lo,so) is the initial state made of the initial label Iq and store so- We as- 
sume that the initial label is always indexed by 0 and that the initial store 
initializes integer variables by the value 0, 

— £ p is the set of commands of p, 

— — > p is the transition relation (actually, a function) on states labeled with 
the current command. 

The initial labels of programs P and C (Figure 2) are po and co respectively 
and the variables x , a and b are initialized to 0. In the remaining, we use c\ g and 
c\ a to denote the guard and the action of the command c respectively. To simplify 
the presentation, we consider only non-terminating programs. Termination could 
always be represented by a final looping command such as l en d '■ skip l en d- 
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2.2 Networks of Components 

A Kpn k is made of a set of processes {p \ , . . . , p n } executed concurrently. Net- 
works are build by connecting output channels to input channels of components. 
Such channels are called internal channels whereas the remaining (unconnected) 
channels are the input and output channels of the network. The communication 
on internal channels is asynchronous (non blocking writes, blocking reads) and is 
modeled using unbounded fifos. In order to guarantee a deterministic behavior, 
Kpns require the following conditions [7]: 

— An internal channel is written by and read from exactly one process. 

— An input channel is read from exactly one component (and written by none) . 

— An output channel is written by exactly one component (and read from 
none) . 

— A component cannot test the absence of values on channels. 

In order to simplify technical developments, we assume that networks have a 
single input and output channels denoted by t and o respectively and that the 
input channel never remains empty. 

The global execution state of a Kpn is called a configuration. It is made 
of the local state of each component and the internal channel states i.e., finite 
sequences of values iq : . . . : v n : c. 

The operational semantics of Kpn is expressed as a Lts (Zfc, ao,£fc, — >k) 
where: 

— Afc is a (infinite) set of configurations, 

— the initial configuration ag is such that each component is in its initial state 
and each internal channel is empty, 

— £k is the union of the sets of commands of components; these sets are sup- 
posed disjoint, 

— the transition relation — >k is defined as performing (non deterministically) 
any enabled command of any process. A command is enabled when the cur- 
rent program point is its Ihs label, its guard is true in the current configura- 
tion/state and it is not a blocking read (i.e., a read on an empty channel). 

The transition relation gives rise to an infinite graph representing all the 
possible execution traces. A small part of the transition relation — > p for our 
example is depicted in Figure 2. Here, no global deadlock is possible and all 
traces are infinite. 

An infinite execution trace is said to be fair if any enabled action at any point 
in the trace is eventually executed. The denotational semantics of a Kpn is given 
by the function from the input values (the input channel) to the output values 
(the output channel) generated by fair executions. We will write Traces (fc) and 
IO(k) to denote the set of traces and the denotational semantics of the Kpn k 
respectively. Kpns of deterministic components are deterministic [7]. Also, all 
fair executions with the same input yield the same output [6]. An important 
corollary for us is that Kpns are serializable: they can always be implemented 
sequentially. 
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3 Abstract Execution Graphs 

Network fusion necessitates to find statically a safe and sequential scheduling. 
This step relies upon an abstract execution graph (Aeg), a finite model upper- 
approximating all the possible executions of the Kpn. We present in this section 
the key properties than an Aeg should satisfy and present an example. 

An Aeg k ** is a, finite Lts (£ k ,a\,£ k , — > k ) with: 

— E k a finite set of abstract configurations, 

— aj is the initial abstract configuration, 

— £ k a (finite) set of commands, 

— — > k a labeled transition relation. 

The idea behind abstraction is to summarize in an abstract configuration a 
(potentially infinite) set of concrete configurations [10]. This set is given by the 
function cone : S k — » V(S k ) defined as: 

conc(a^) = {a \ a ss a 

where ss is a safety relation relating k and 0 (and we write k « 0). 

There can be many possible abstractions according to their size and accuracy. 
Network fusion is generic w.r.t. abstraction as long as the Aeg respect two key 
properties: safety and faithfulness. To be safe, the initial abstract configuration of 
an Aeg must safely approximate the initial concrete configuration. Furthermore, 
if a configuration a\ is safely approximated by a\ and the network evolves in 
the configuration CC 2 , then there exists a transition from a\ to a\ in the Aeg 
such that «2 is safely approximated by a\. These two points ensure that any 
execution trace of the Kpn is safely simulated by one in the Aeg. Formally: 

Definition 1 (Safety). Let k « 0 , then 0 is a safe approximation of k iff 

_ tj 
«0 ~ OL q 

d , c .-.a « . a c d 

a\ « a\ A a.\ — > k Oii => d Oi\ . a2 ~ a) A a\ — > k a 2 

A key property of safe abstractions is that they preserve fairness. Of course, 
since they are upper approximations they include false paths (abstract traces 
whose concretization is empty). However, for abstract traces representing feasible 
concrete traces, fair abstract traces represent fair concrete traces. Safety also 
implies that all fair concrete traces are represented by fair abstract traces. 

An Aeg is said to be faithful if each abstract transition corresponds to a 
concrete transition modulo the non-satisfiability of guards or a blocking read. 
In other words, faithfulness confines approximations to values. A false path can 
only be an abstract trace with a transition whose concrete image would be a 
transition with a false guard or a blocking read. Formally: 

Definition 2 (Faithfulness). Let k « 0 , then 0 is a faithful approximation 
°f k iff 

a{ —^0 k a 2 A cci « a{ =>■ 3a2-a2 « a\ A ai — 0 k a ,2 

v ^£[ c M q i 

V (c| 0 = P-x A ati[f >—> e]) 
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Faithfulness rules out, for instance, the (highly imprecise but safe) abstrac- 
tion made of a unique abstract state representing all concrete states. In practice, 
the same abstract state cannot represent different program points (label config- 
urations) . 

In order to provide some intuition we give here a crude but straightforward 
abstraction: 

— Each process state is abstracted into the program point it is associated to. 
So, variables are not taken into account and process stores are completely 
abstracted away, 

— Each internal channel state is represented by an interval approximating the 
length of its file. 




Fig. 3. Example of an Aeg 



It is the control flow graph of the Kpn where each node holds a collection of 
intervals approximating the lengths of internal channels at the configuration of 
program points the node represents. The Aeg for our running example is given 
in Figure 3. In this particular example, the state of / is always approximated by 
the interval [0,+oo[ (the most imprecise information). More precise Aegs could 
be designed (see Section 5.2). 

An Aeg bears enough information to be translated back into a program. 
Commands (guards and actions) label edges and nodes represent labels. The 
concretization of finite Lts 0 = (S k ,a.g,£ k , — > k ) into a program is formal- 
ized by the following straightforward translation: 

Concretization(k^) = {l a : l a \ a\ -^> k } 
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An important property of safe and faithful abstractions is that their con- 
cretization has the same semantics as the network they approximate. 

Property 1. If k ^ is a safe and faithful approximation of k then Traces (k) = 
Traces (Concretization(k^)) 

4 Fusion 

The user can specify scheduling constraints defining a subset of execution traces. 
Constraints impose implementation choices; they serve to guide and to optimize 
the fusion process. Constraints respect the black box nature of components. 
They are expressed w.r.t. 10 operations, liveness properties or sizes of files. 
When constraints completely sequentialize the execution (no choice remains), 
they specify a scheduler. In general, however, constraints are incomplete and 
leave implementation choices. 

4.1 Scheduling Constraints 

We specify constraints by finite state Lts labeled with guarded actions. Of 
course, a more user-friendly language for declaring constraints should be stud- 
ied but this is not the purpose of this article. The formalism used in Sections 
2 and 3 is also well-suited to expressing constraints. We enrich the language of 
guards with two additional constructs dedicated to the expression of scheduling 
strategies: 

g c ::= / 0 k | | g where 0 is any comparison operator 

The size of a channel can be compared against an integer. For instance, / < 5 
is true if the file / has less than 5 elements. The guard Q3 P is true if the process 
p is blocked (by a read on an empty channel or by other scheduling constraints) . 

Constraints are more easily specified using sets of actions. We use the follow- 
ing notations: 

A::=*\[f]'!\[f]\\^A\A 1 nA 2 \A p 

where 

— * represents any action of the network, 

— ? (resp. !) represents any read (resp. write) and /? (resp. /!) any read (resp. 
write) on file /, 

— ~^A is the complementary set of A, 

— Ai D A 2 is the intersection of the sets A\ and A 2 , 

— A p represents the projection of the set of actions A onto the commands of 
the component p. 

For instance, ( _l ?) p represents all non-read actions of component p. Using 
sets is just more convenient; constraints can be automatically translated into a 
standard Lts labeled by standard commands afterward. 
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Figure 4 gathers a few examples of constraints for a network with (at least) 
two components P (writing a file /) and C (reading the file /) . 

— The constraint Aff summarizes in a small automaton the strategy used by 
Filter Fusion [12]. The producer P starts until it writes on /? The control is 
passed to the consumer C until it reads / and so on. This strategy bounds 
the size of the fifo / to be at most 1 and therefore it may introduce artificial 
deadlocks from some networks. Aff sequentializes completely the execution 
of P and C (no scheduling choice remains). 

— The constraint Ajs is similar to Aff except that both P and C can be 
executed between writes and reads on /. Ajs leaves some scheduling choices. 

— The constraint Afb is a generalization of Aff to a file / with k places ( i.e ., 
P writes k times before the control is passed to C). This is the formalization 
of the extension of filter fusion proposed in [3] 

— A demand driven strategy is specified by Add- The consumer C is executed 
until it blocks i.e., is about to read the empty channel /. Then, P is executed 
until it produces a value in /. The control is passed to C which immediately 
reads / and continues. 

These constraints can be applied to any network as long as it has two com- 
ponents P and C connected at least with a channel /. Of course, constraints can 
be specified for any number of components and channels. 
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4.2 Enforcing Constraints 

Enforcing a constraint A = (£\, Ao, £\, — >a) to an Abstract Execution Graph 
0 = (S k ,a\,£ k , — ) can be expressed as a parallel composition (fc# || A). 
This operation can be defined formally as follows. We assume that all shorthands 
(like ( _, ?)p) used in constraints are replaced by the actions of the Aeg they 
represent. 



|| A = (E k x Ex, (a$, Xo),£ k , — >k\) 

with 

a» ~^ k c/ a* -^ fc c/ a € S k \S X 

(a#, A) 9 ^} a kX (a#', A') («“> A ) ^ (a#', A) 

If an action a is taken into account by the constraints, the execution can pro- 
ceed only if both Lts can execute a (i.e., they can both execute commands made 
of a and a true guard). The actions not taken into account by the constraints 
can be executed independently whenever possible. Constraints do not introduce 
new actions {£\ C £ k ). To simplify the presentation, we assumed in the above 
inference rules that the guards did not use the condition 03 p . We now present 
the rule corresponding to this condition in isolation. 

The 03 p construct serves to pass the control to another component when 
one is blocked. The condition < B p is easily defined w.r.t. Kpns: p is blocked in 
configuration a if there is no outgoing transition labeled with a command of 
p. However, Aegs are approximations with false paths; a component p can be 
blocked even if the corresponding abstract state has outgoing transitions labeled 
with commands of p. Actually, p is blocked in an abstract state if any outgoing 
p transition has either a false guard or is a read on an empty channel (i.e., is not 
enabled). Formally, let C\ , ... ,c n all the commands of p such that a® cnj 

and n = I Ci ^ V 7 = 0 if Ci l“= f ?x 

1 1 -, ( c i|cf) otherwise 

then the necessary and sufficient condition for p to be blocked in a® is 



M a# ) = A 9i 



The product of an Aeg with a transition guarded by 03 p is defined as follows: 

it' \ ® 1° 

A — 



u q\a 

a N — > k a " 



A' 



(otK A) 



gAb (a )|o uf 

— >fcA (or , A ) 



Figure 5 represents the product of the Aeg of Figure 3 with App. The com- 
ponent P is executed until it produces a value on / then C is executed until it 
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Fig. 5. Fusion with Aff 



reads a value on /. Note that if a > b remains always true then P will never be 
executed. So, the execution is not fair but it is nevertheless correct and yields 
the same output as the network (P is never executed only when its production 
is not needed). The strategy does not use guards, so no new test appears in the 
constrained Aeg. The result is completely sequentialized. 

After the constrained Aeg is produced, the size of hies is reestimated using 
standard static analysis techniques. We have indicated in Figure 5 the new ap- 
proximations for /. They show that the Aeg is now bounded (the size of / is at 
most 1). 

It is easy to check that the Aeg can be translated by Concretization (see 
Section. 3) into a sequential program. As already mentioned, one goal of fusion 
is to suppress internal communications. For unbounded Aeg, internal reads and 
writes are replaced by assignments to lists or hfos. Here, the channel / can be 
implemented by a single variable Vf and writes fix and reads f?a by assignments 
Vf := x and x := Vf. These assignments can then be suppressed using standard 
optimization techniques [1]. Finally, after a renaming of labels, we get: 



PC = 



pco : x := x + 1 
pci : a < b | a := 
< pci : a> b | ilb 
PC 2 : x := x + 1 
PC 3 : ola + b 



pci; ' 

x pc 2 ; 
^pc 3 ; > 
^pc 3 ; 



We have presented the parallel composition as a fairly standard automata 
product. Depending on the size of the Lts, this may cause an unacceptable 
state explosion. We present a solution to this problem in Section 5.3. 
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4.3 Scheduling 

In general, constraints enforcement leaves implementation choices which must be 
taken to produce a sequential program. The fusion process makes these choices 
automatically by scheduling the execution of components. A valid schedule must 
be fair (all enabled components must be eventually executed) and sequential (the 
scheduled execution must correspond to a sequential program). 

We choose here a simple and fair policy: round-robin scheduling. Components 
are ordered in a circular queue and the scheduler activates them in turn. Either 
the current active component is blocked (by a read or a user defined constraint) 
either one of its command is executed. In both cases, the control is passed to 
the next component. Figure 6 formalizes round-robin for networks with two 
components P and C as a guarded Lts. It would be easy to generalize such a 
round-robin Lts for any network with a fixed number of processes. 




Fig. 6. Round-Robin Scheduling 



The schedule is fair and ensures a complete serialization of the execution. It 
starts by enforcing the execution of one instruction of P, then one instruction of 
C and so on. If one of the two processes is blocked at its turn, then an instruction 
of the other process is executed instead. When both processes are blocked then 
it is a global deadlock denoted by the special instruction deadlock. 

Constrained Aegs are composed in parallel with the automaton of Figure 
6 to obtain sequential programs. The composition is the same as before except 
that the deadlock action does not belong to the set of actions of components. 
The product will therefore introduce a new deadlock transition along with a 
new state in the Aeg. This new transition, which detects a global deadlock, 
will be implemented by printing an error message and terminating the program. 
When such a transition appears in the result of fusion, the user is warned of a 
possibility of deadlock. 

Let us consider the scheduling of the original Aeg of Figure 3. This situation 
would arise if the user does not provide any constraint. The Aeg obtained after 
product (and simplifications) is given in Figure 7. Simplifications are needed 
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Fig. 7. Sequentialization with Round-Robin 

since the product of transitions guarded by 03 x produces many dummy transi- 
tions ( i.e with false guards). 

The process P is never blocked (it never reads), so the execution can start 
by x := x + 1. The execution must proceed by C if it is not blocked (^03c). 
There are two cases: either a > b and C is not blocked and its action (t?6) 
can be executed, either a < b and C is blocked by a read of the empty file /. 
In the latter case, round-robin scheduling passes the control to P and executes 
the action fix. The transition a < b \ fix corresponds to “if C is blocked then 
execute the next P’s command” . We do not describe any further the product 
which proceeds similarly. Contrary to the product with App, the result is fair but 
unbounded: the data produced by P may accumulate in the channel / without 
bounds. 

The correctness of the scheduling process comes from the fact that the prod- 
uct with App yields a sequential, fair and faithful Aeg. Note that network fusion 
is generic w.r.t. the scheduling strategy. More sophisticated policies ( e.g ., using 
several queues, based on static or dynamic priorities, etc.) could be considered 
as well. As in Section 4.2, we have presented scheduling as an Lts product; 
scheduling could also be implemented by the technique outlined in Section 5.3. 

4.4 Semantic Issues 

User-defined constraints can change the semantics of the Kpn. For example, 
a constraint which bounds communication channels would cause an artificial 
deadlock into an unbounded Kpn. The user may want to enforce properties even 
at the price of deadlocks. We consider that a change of semantics is acceptable 
as long as it depends on the user and remains under her or his control. However, 
this requires to restrict the class of acceptable constraints. Consider the following 
constraint: 
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A? 



where a and b are two non-mutually exclusive commands and A\ and A 2 repre- 
sents distinct constraints. Since a and b can be executed indifferently, a choice 
will be made by scheduling. However, depending on this choice, the constraints 
that are enforced afterwards {A\ or A 2 ) are different. For example, A\ may imply 
an artificial deadlock in some (non statically determined) cases and A 2 in some 
other cases. In other words, the semantics of the resulting program will depend 
on a blind choice made by fusion. This semantic change is not acceptable since 
it would be out of the control of the user. 

Our solution is to restrict the class of acceptable constraints. Namely, if a 
constraint leaves a non deterministic choice such as a or b above then the con- 
straints must ensure that all processes can still evolve in the same way (an 
artificial deadlock in one side implies that we have the same artificial deadlock 
on the other side). Each choice a or b correspond to set (language) of acceptable 
traces (a.£(Ai) and 6.£(A 2 ))- A constraint is acceptable if for each choice, the 
projection of the corresponding languages to the commands of any process are 
equivalent. For the above example we must enforce that 

Vp. a.C(Ai) ip = b.C(A 2 ) | p 

With this condition, the choices made by the scheduling step do not have any 
semantic impact. 

All the constraints of Figure 4 are acceptable. It is obvious for App, App, App 
since they do not leave any non deterministic choice. In ff/s, the two transitions 
labeled by “i/!(~|-i/? leave the choice between executing P or C. However, in both 
states, they lead to the same state (and therefore accept the same language). 



5 Extensions 

The preceding sections have presented the main ideas of network fusion. We hint 
here at three ways of extending the basic technique: providing more linguistic 
support to the user, working on more precise abstractions, avoiding products 
between Lts. These three extensions all aim at getting more efficient fused pro- 
grams. 

5.1 Linguistic Support 

Scheduling constraints allow users to control network fusion. Other linguistic 
support could be provided to users as well. We focus here on special commands 
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allowing to alleviate the false path problem. False paths arise when data de- 
pending controls are abstracted by non deterministic choices [2]. This standard 
approximation makes fusion consider infeasible paths and spurious deadlocks. 



Synchronization using data 
values 



Synchronisation using linguistic 
extensions 



l 0 : 4? AT 


h; 


h : ct\N 


h\ 


I 2 • i '.= 0 


h\ 


h : i < N 


i = i + 1 li', 


I 3 : i > N 


skip lo\ 


U : dt\x 


I 3 


0 

6 

-S’ 


h\ 


h : ctlM 


h', 


h :j := 0 


h', 


l 3 :j< M 


3=3 + 1 1+ 


I 3 ■ j > M 


skip io; 


U : dtty 


Is', 


h ■ o\y 


l 3 



P = 



it N 


li; 


ct\N 


h; 


i := 0 


h', 


i<N\i — i+1 


W, 


i > N | wait(dt) 


lo; 


dt\x 


l 3 


C = 


0 ! 0 


^ h; 


^waiting?(dt) | dt?y 


^ I 3 ; 


waiting?(dt) | proceed(dt) 


lo; 


o\y 


li 



Fig. 8. A false path problem (left) and its solution (right) 



The left part of figure 8 shows a simple but characteristic example of the 
problem. The process P begins by sending on channel ct the number of items 
it will produce on channel (It. Then, P and C respectively writes and reads the 
same number of items on dt {Ad = N). However, this information is lost in the 
Aeg which abstracts away values. The fusion process must therefore consider 
the case where P produces not enough values and C is blocked and also the case 
where P produces unconsumed values and the size of dt cannot be bounded. 

This problem can be alleviated using commands making synchronization or 
termination explicit. The languages of actions and guards are extended with the 
following constructs: 



a ::= wait(/) | proceed(/) | . . . 
g ::= waiting? (/) | . . . 

— The commands wait(/) and proceed(/) permit to express a rendezvous 
between the producer and the consumer of a file /. The producer blocks on 
wait(/) until the consumer emits proceed(/). 

— The predicate waiting?(/) evaluates to true if the producer is waiting on 
wait(/) and all data has been consumed on /. It evaluates to false if there is 
some available data. It blocks if there is no available data and the producer 
is not waiting. 
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These instructions are just syntactic sugar and waiting?(/) do not affect 
the determinism of Kpns. They could be implemented by writing/reading a 
special value on an additional channel. On the other hand, they do provide more 
information to the fusion process and permit to avoid false paths. 

The right part of Figure 8 shows how to take profit of these instructions on the 
previous example. Instead of communicating via ct the number of items written 
on dt , P finishes its emission by waiting to C . The consumer reads until it has 
consumed all data produced by P; it then releases P and both processes may 
proceed. Such explicit rendezvous can be taken into account by the abstraction 
step to avoid the problematic false paths mentioned above. 

Others instructions could be considered as well. For example, an instruction 
close (/), indicating that a process will not write or read on / anymore, would 
also be useful. 

5.2 More Precise Abstractions 

In section 2, we presented an abstraction representing the control flow graph of 
the Kpn. This abstraction gives a very imprecise approximation for the size of 
file / at each state ([0;+oo[). As long as they respect the safety and faithful- 
ness properties, many other abstractions could be used. We present here a new 
abstraction aimed at finding bounded schedules when they exist. A bounded 
schedule ensures that the size of fifo files remain bounded throughout the execu- 
tion. In this case, fifo can implemented (after fusion) by local variables (instead 
of dynamically allocated data structures). Furthermore, when the precise size of 
a channel is known there is no need to test for its emptiness before reading it. 

In some contexts, such as embedded systems, it is crucial to find bounded 
schedules and lot of work has been devoted to this issue ([11], [5], [13]). In the 
context of Petri Nets, Cortadella et al. presented a new criterion which limits the 
search state for schedules [5] . They conjectured that if a bounded schedule exists 
then it will be found in the delimited search space. We can adapt the criterion to 
our context to produce abstractions suited to the discovery of bounded schedule. 

The idea is to have precise abstract states associating an integer to each size 
(not an interval anymore). The same set of control points ( e.g (po,Ci)) may 
appear several times in the Aeg with different sizes of files. The finiteness of the 
Aeg is ensured by the irrelevance criterion. A state S 2 is said irrelevant if the 
Aeg contains another state si such that: 

— S 2 is reachable from si, 

— all fifo have their size in S 2 greater or equal than their size in si, 

— each fifo whose size is greater in S 2 than in si has a non-zero size in si. 

The idea behind this criterion is that a irrelevant state cannot enable any 
new action (e.g., it does not enable a blocked read). It is no use to continue 
unfolding the graph. It can be closed using a state where the size of channels are 
approximated by [0; +oo[. A small part of the Aeg obtained using the irrelevance 
criterion on our running example is shown on Figure 9. A bounded schedule can 
be found in the part shown. This improved precision allows to find bounded 
schedules automatically; it also involves larger Aegs. 
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Fig. 9. Aeg with irrelevance criterion (excerpt) 



5.3 Instrumented Product 

Two steps of network fusion are described as a synchronized product between 
guarded Lts. Obviously, in some cases, this could cause a state explosion and pro- 
duce too large programs. A solution to avoid this space problem is to implement 
the product by instrumenting the Aeg. The Lts representing the constraints or 
the schedule is taken into account by the Aeg by introducing a variable (to rep- 
resent the state of the Lts) and new instructions (to represent state transitions). 

We have used such a technique in [4] to enforce safety properties (expressed 
as finite state automata) on programs. We have shown that the instrumentation 
can be made very efficient using simple techniques (specialization, minimization 
and reachability analysis). This instrumented product introduces at worst an 
assignment (a state transition) at each if and while command. 

This technique is easily extended to guarded Lts. Figure 10 represents the 
result of the instrumented product between the Aeg of Figure 3 and App. It 
has the same number of states as the original Aeg. On the other hand, instruc- 
tions (l := {0, 1}) and tests (l = {0, 1}) have been inserted to encode the state 
transitions of App. Compared to the Lts of Figure 5 which represents the stan- 
dard synchronized product, some states like (Ao,po>co) and (AijPoAo) are now 
merged into a single state (po,co)- Transitions from this state must now test 
whether underlying Lts App is in the state A 0 or Ai. 

On this example, the smaller number of states is certainly not worth the 
overhead. In general, however, instrumented product is at most linear in size 
whereas synchronized product may entail a quadratic blowup. A small time 
overhead is preferable to a space explosion. In any case, the user should be given 
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Fig. 10. Instrumented product with Aff 



the opportunity to specify on which Lts (or on which parts of a Lts) using 
standard or instrumented product. 



6 Conclusion 

The inspiration and motivation for this work came from two main sources: 

— Filter fusion [12], a simple algorithm to merge a producer connected by a sin- 
gle channel to a consumer. Filter fusion is restricted to very specific networks 
(pipelines) and to a fixed strategy. Our work can be seen as a formalization 
of filter fusion using synchronized product and as well as a generalization to 
arbitrary networks and user-defined strategies. The application of our tech- 
nique on pipelined filters with the constraint Aff (Figure 4) is equivalent to 
filter fusion. The extension of filter fusion with a more sophisticated schedul- 
ing strategy proposed in [3] is formalized in our framework by the scheduling 
constraints Afb (Figure 4). 

— Aspect-Oriented Programming (AOP) [8] whose goal is to isolate aspects 
( e.g security, synchronization, etc.) that cross-cut the basic functional- 
ity of the program. Our scheduling constraints can be seen as a schedul- 
ing/synchronization aspect and their enforcement as aspect weaving. As in 
[4] , we consider aspects specified as temporal formulas on the trace semantics 
of programs. In network fusion, aspects express scheduling and synchroniza- 
tion choices by filtering unwanted (or selecting desired) execution traces. 
This restricted and formal view permits to describe and control precisely 
the semantic impact of weaving (usually, a very difficult task in AOP). 
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Some functional program transformations bear similarities with network fu- 
sion. As fusion aims at removing values produced on channels by the composition 
of components, Listlessness [15] and deforestation [16] aim at removing the in- 
termediate data structures produced by the composition of functions. As filter 
fusion, these transformations consider producer-consumer pairs and have a fixed 
fusion strategy. 

The area of embedded/reactive systems has produced a large body of work on 
static scheduling. Lin [9] studies the static scheduling of synchronously commu- 
nicating processes. Cortadella et al. [5] and Strehl et al. [13] consider scheduling 
of asynchronous process networks. They all use petri nets as their underlying 
formalism. Like Parks [11], they focus on bounded scheduling and do not con- 
sider user-defined constraints even if some integrate a form of fusion. Strehl et 
al. [14] propose a design model that permits the specification of components and 
scheduling constraints. They derive a scheduler but do not consider fusion. 

We have presented a generic and flexible framework for merging networks 
of interacting components. It is based on guarded labeled transition systems, 
synchronized product, static analysis and transformation techniques. Fusion can 
be applied to a large class of networks (Kpns) and can take into account user- 
defined scheduling constraints. The technique can be parameterized by different 
abstractions, constraints and scheduling strategies. Still, a lot of work remains 
to be done. 

The formalization and the correctness proofs should be completed. A proto- 
type needs to be implemented in order to validate the approach experimentally. 
We expect that large programs can be abstracted into small automata since 
fusion focuses on I/O instructions (blocks of internal instructions can be rep- 
resented by a single action). Along with the use of instrumented product in 
problematic cases, we are confident that efficient and reasonable sized programs 
can be produced. 

More generally we see network fusion as part of a more general framework 
to assemble and fuse components. A first feature of such a framework would 
consist in an architecture description language to specify the assembly (i.e., the 
ports and their connections). Another useful feature would be the ability to 
specify the synchronization instructions. They do not have to be 10 instructions 
as supposed previously. By considering some actions of two components as 10 
operations on a (conceptual) channel f (i.e., fix and fix), it becomes possible 
to impose constraints on their interleaving. 
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Abstract. There are two ways to write a program for manipulating 
tree-structured data such as XML documents and S-expressions: One is 
to write a tree-processing program focusing on the logical structure of 
the data and the other is to write a stream-processing program focus- 
ing on the physical structure. While tree-processing programs are easier 
to write than stream-processing programs, tree-processing programs are 
less efficient in memory usage since they use trees as intermediate data. 
Our aim is to establish a method for automatically translating a tree- 
processing program to a stream-processing one in order to take the best 
of both worlds. We define a programming language for processing binary 
trees and a type system based on ordered linear type, and show that every 
well-typed program can be translated to an equivalent stream-processing 
program. 



1 Introduction 

There are two ways to write a program for manipulating tree-structured data 
such as XML documents [3] and S-expressions: One is to write a tree-processing 
program focusing on the logical structure of the data and the other is to write 
a stream-processing program focusing on the physical structure. For example, 
as for XML processing, DOM (Document Object Mode) API and programming 
language XDuce [5] are used for tree-processing, while SAX (Simple API for 
XML) is for stream-processing. 

Figure 1 illustrates what tree-processing and stream-processing programs 
look like for the case of binary trees. The tree-processing program / takes a 
binary tree t as an input, and performs case analysis on t. If t is a leaf, it incre- 
ments the value of the leaf. If t is a branch, / recursively processes the left and 
right subtrees. If actual tree data are represented as a sequence of tokens (as is 
often the case for XML documents) , / must be combined with the function parse 
for parsing the input sequence, and the function unparse for unparsing the re- 
sult tree into the output sequence, as shown in the figure. The stream-processing 
program g directly reads/ writes data from/to streams. It reads an element from 

W.-N. Chin (Ed.): APLAS 2004, LNCS 3302, pp. 41-56, 2004. 

(c) Springer- Verlag Berlin Heidelberg 2004 
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Tree-processing program /: 

fix /.At. (case t of leaf x =>• leaf (x + 1 ) | node xi X2 =>■ node (/ *1) (/ *2)) 
Stream-processing program: g 

fix g. At. (case read() of leaf => leaf (read() + 1) 
node => write(node); g ()-g ()) 



Fig. 1 . Tree-processing and stream-processing 



the input stream using the read primitive and performs case-analysis on the 
element. If the input is the leaf tag, g outputs leaf to the output stream with 
the write primitive, reads another element, adds 1 to it, and outputs it. If the 
input is the node tag, g outputs node to the output stream and recursively 
calls the function g twice with the argument (). 

Both of the approaches explained above have advantages and disadvantages. 
Tree-processing programs are written based on the logical structure of data, so 
that it is easier to write, read, and manipulate (e.g. apply program transfor- 
mation like deforestation [14]) than stream-processing programs. On the other 
hand, stream-processing programs have their own advantage that intermediate 
tree structures are not needed, so that they often run faster than the correspond- 
ing tree-processing programs if input/output trees are physically represented as 
streams, as in the case of XML. 

The goal of the present paper is to achieve the best of both approaches, by 
allowing a programmer to write a tree-processing program and automatically 
translating the program to an equivalent stream-processing program. To clarify 
the essence, we use a A-calculus with primitives on binary trees, and show how 
the translation works. 

The key observation is that: (1) stream processing is most effective when trees 
are traversed and constructed from left to right in the depth-first manner and (2) 
in that case, we can obtain from the tree-processing program the corresponding 
stream-processing program simply by replacing case analysis on an input tree 
with case analysis on input tokens, and replacing tree constructions with stream 
outputs. In fact, the stream-processing program in Figure 1, which satisfies the 
above criterion, is obtained from the tree-processing program in that way. 

In order to check that a program satisfies the criterion, we use the idea of 
ordered linear types [11,12]. Ordered linear types, which are an extension of 
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Tree-processing program: 

fix /.At. (case t of leaf x => leaf x \ node x\ *2 => node (/ * 2 ) (/ * 1 )) 



Fig. 2. A program that swaps children of every node 



linear types [2,13], describe not only how often but also in which order data 
are used. Our type system designed based on the ordered linear types guaran- 
tees that a well-typed program traverses and constructs trees from left to right 
and in the depth-first order. Thus, every well-typed program can be translated 
to an equivalent stream-processing program. The tree-processing program / in 
Figure 1 is well-typed in our type system, so that it can automatically be trans- 
lated to the stream-processing program g. On the other hand, the program in 
Figure 2 is not well- typed in our type system since it accesses the right sub-tree 
of an input before accessing the left sub-tree. In fact, we would obtain a wrong 
stream-processing program if we simply apply the above-mentioned translation 
to the program in Figure 2. 

The rest of the paper is organized as follows: To clarify the essence, we first 
focus on a minimal calculus in Section 2-4. In Section 2, we define the source 
language and the target language of the translation. We define a type system 
of the source language in Section 3. Section 4 presents a translation algorithm, 
shows its correctness and discuss the improvement gained by the translation. The 
minimal caclulus is not so expressive; especially, one can only write a program 
that does not store input/output trees on memory at all. (Strictly speaking, one 
can still store some information about trees by encoding it into lambda-terms.) 
Section 5 describes several extensions to recover the expressive power. With the 
extensions, one can write a program that selectively buffers input/output trees 
on memory, while the type system guarantees that the buffering is correctly 
performed. After discussing related work in Section 6, we conclude in Section 7. 

For the restriction of space, proofs are omitted in this paper. They are found 
in the full version [7]. 

2 Language 

We define the source and target languages in this section. The source language 
is a call-by-value functional language with primitives for manipulating binary 
trees. The target language is a call- by- value, impure functional language that 
uses imperative streams for input and output. 

Source Language. The syntax and operational semantics of the source lan- 
guage is summarized in Figure 3. 

The meta-variables x and i range over the sets of variables and integers 
respectively. The meta-variable W ranges over the set of values, which con- 
sists of integers i, lambda-abstractions A x.M, and binary-trees V. A binary 
tree V is either a leaf labeled with an integer or a tree with two children, 
(case M of leaf x => Mi | node x\ X 2 => M 2 ) performs case analysis on a 
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Terms, values and evaluation contexts: 



M (terms) 



V (tree values) 

W (values) 

E s (evaluation contexts) 



::= i | A x.M | a: | Mi M 2 | Mi + M 2 \ fix f.M 
leaf M | node Mi M 2 

| (case M of leaf x =4> Mi | node xi x 2 =>• M 2 ) 
::= leaf i | node Vi V 2 
::= i | A x.M | V 

::= [] | E s M | (A x.M) E S \E S +M\i + E s 
| leaf E s | node E s M | node V E s 
| (case E s of leaf x =*• Mi J node *1 x 2 =>■ Af 2 ) 



Reduction rules: 

E a [ii + i 2 ] — > E a [plus(ii,i 2 )] (Es-Plus) 

E s [(Xx.M)W] — > E a [[W/x\M] (Es-App) 

£ s [fix f.M] — > E s [[fix f.M/f]M] (Es-Fix) 

Encase leaf i of leaf x =4- Mi | node xi x 2 => M 2 ] — > E s [{i/x]M\] (Es-Case1) 

E s [case node Vi V 2 of leaf x =4- Mi | node xi x 2 A/ 2 ] — > 

Es[[V 1 /x 1 ,V 2 /x 2 ]M 2 ] 

(Es-Case2) 



Fig. 3. The syntax, evaluation context and reduction rules of the source language. 
plus(ii, i 2 ) is the sum of i\ and i 2 



tree. If M is a leaf, x is bound to its label and M\ is evaluated. Otherwise, x\ 
and X 2 are bound to the left and right children respectively and M 2 is evaluated, 
fix f.M is a recursive function that satisfies / = M. Bound and free variables 
are defined as usual. We assume that a-conversion is implicitly applied so that 
bound variables are always different from each other and free variables. 

We write let x = Mi in M 2 for ( \x.M 2 ) M\. Especially, if M 2 contains no 
free occurrence of x, we write Mi; M 2 for it. 

Target Language. The syntax and operational semantics of the target lan- 
guage is summarized in Figure 4. A stream, represented by the meta variable S, 
is a sequence consisting of leaf, node and integers. We write 0 for the empty 
sequence and write Si: S 2 for the concatenation of the sequences S 1 and S 2 . 

read is a primitive for reading a token (leaf, node, or an integer) from the 
input stream, write is a primitive for writing a value to the output stream. The 
term (case e of leaf => ei | node e 2 ) performs a case analysis on the value 
of e. If e evaluates to leaf, ei is evaluated and if e evaluates to node, e 2 is 
evaluated, fix f.e is a recursive function that satisfies / = e. Bound and free 
variables are defined as usual. 

We write let x = ei in e 2 for (Xx.e 2 ) e\. Especially, if e 2 does not contain x 
as a free variable, we write ei; e 2 for it. 

Figure 5 shows programs that take a tree as an input and calculate the sum 
of leaf elements. The target program assumes that the input stream represents 
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Terms, values and evaluation contexts: 



e (terms) 

v (values) 
Et (evaluation contexts) 



::= v \ x \ e\ e? \ e\ + e 2 \ fix f.e 

read e j write e | (case e of leaf =>• ei | node => e 2 ) 
i | leaf j node | \x.e | () 

[] | Et e | (\x.e) Et \ Et -t- e | i + Et 
| read E t | write E t 
(case Et of leaf =*- ei j node => e 2 ) 



Reduction rules: 



(E t [v i +v 2 ] ,Si,S„) — ♦ (E t [plus{v 1 ,v 2 )\, Si, S 0 ) (Et-Plus) 
(E t [(\x.M)v), St, S 0 ) — ♦ {E t [[v/x\M],Si, S a ) (Et-App) 

(E t [fix f.e], S u S 0 ) — > (E t [[fix f.e/f]e],Si, S a ) (Et-Fix) 

(E t [read()],v; St, S 0 ) — > (E t [v], S it S„) (Et-Read) 

(Et [write r], Si, S 0 ) — ♦ (Et[()], Si, S 0 ; v) (when v is an integer, leaf or node) 

(Et- Write) 

(E t [case leaf of leaf => ei | node => e 2 ],Si,S 0 ) — ♦ (E t [ei], Si, S 0 ) (Et-Case1) 
(Et[case node of leaf => ei | node => e 2 ],Si,S 0 ) — > (E t [e 2 ], Si, S 0 ) (Et-Case2) 



Fig. 4. The reduction rules of the target language 



A source program: 

fix sumtree.Xt . (case t of leaf x =>■ x \ node x\ x 2 ( sumtree xi) + ( sumtree x 2 )) 

A target program: 

fix sumtree.Xt. (case read() of leaf => read() | node sumtree () + sumtree ()) 



Fig. 5. Programs that calculate the sum of leaf elements of an binary tree 



a valid tree. If the input stream is in a wrong format (e.g., when the stream is 
node; 1; 2), the execution gets stuck. 



3 Type System 

In this section, we present a type system of the source language, which guaran- 
tees that a well-typed program reads every node of an input tree exactly once 
from left to right in the depth-first order. Thanks to this guarantee, any well- 
typed program can be translated to an equivalent, stream-processing program 
without changing the structure of the program, as shown in the next section. To 
enforce the depth-first access order on input trees, we use ordered linear types 
[ 12 , 11 ], 
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3.1 Type and Type Environment 

Definition 1 (Type). The set of types , ranged over by r, is defined by: 

r (type) ::= Int | Tree d | n — > t 2 
d, (mode) ::= — | +. 

Int is the type of integers. For a technical reason, we distinguish between 
input trees and output trees by types. We write Tree - for the type of input 
trees, and write Tree+ for the type of output trees, n — > r 2 is the type of 
functions from iq to r 2 . 

We introduce two kinds of type environments for our type system: ordered 
linear type environments and (non-ordered) type environments. 

Definition 2 (Ordered Linear Type Environment). An ordered linear type 
environment is a sequence of the form aq :Tree - , . . . , x n :Tree - , where aq, . . . , a: n 
are different from each other. We write A\, Z\ 2 for the concatenation of A\ and 
A2 . 

An ordered linear type environment x\ : Tree - , . . . , x n : Tree - specifies not 
only that aq, . . . , a: n are bound to trees, but also that each of aq, . . . , x n must 
be accessed exactly once in this order and that each of the subtrees bound to 
xi,...,x n must be accessed in the left-to-right, depth-first order. 

Definition 3 (Non-ordered Type Environment). A (non-ordered) type en- 
vironment is a set of the form {aq :ti, . . . ,x n :T n } where X\, ... ,x n are different 
from each other and {r l5 . . . , t„} does not contain Tree d . 

We use the meta-variable r for non-ordered type environments. We often 
write r,x:r for AJ{i:t}, and write aq : iq, . . . , x n : r„ for {aq : iq, . . . , x n : r„}. 

Note that a non-ordered type environment must not contain variables of 
tree types. Tree - is excluded since input trees must be accessed in the specific 
order. Tree"*" is excluded in order to forbid output trees from being bound to 
variables. For example, we will exclude a program like let aq = t\ in let a: 2 = 
t 2 in node aq X2 when t\ and f 2 have type Tree + . This restriction is convenient 
for ensuring that trees are constructed in the specific (from left to right, and in 
the depth- first manner) order. 

3.2 Type Judgment 

A type judgement is of the form T | A b M : r, where r is a non-ordered type 
environment and A is an ordered linear type environment. The judgment means 
“If M evaluates to a value under an environment described by T and A, the 
value has type r and the variables in A are accessed in the order specified by 
A” For example, if F = {/ : Tree - — > Tree + } and A = aq : Tree - ,a; 2 : Tree - , 

r | A h node (/ aq) (/ x 2 ) : Tree + 



holds, while 



r | A b node (/ x 2 ) (/ aq) : Tree 
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r | x : Tree b x : Tree 

F, x : r | 0 b x : r 
r I 0 b i : Int 
r | x : Tree - b M : t 
r | 0 b A x.M : Tree - — ► t 
r, x : Ti | 0 b M : T 2 
T I 0 b Ax.M : n — > T 2 



(T-Var1) 

(T-Var2) 

(T-Int) 

(T-Abs1) 

(T-Abs2) 



T | Zii b Mi : T2 — > ri T | A 2 b M 2 : T2 



r 


| Z\i,Z\2 1 “ M1M2 • T\ 


r I Ai b Ml : Int r I A 2 b M 2 : Int 


r\ 


zAi, /I2 I - Afi + M2 '• Int 


r,f: 


T\ — ► T2 | 0 h M : n — ► T2 


r 


| 0 b fix f.M : t\ — > T 2 




/• A - M : Int 


T 


| A b leaf M : Tree" 1 " 


r | Ai b Mi 


: Tree + F \ A 2 b M 2 : Tree + 



T | Ai . A2 b node Mi M2 : Tree + 

r | Ai b M : Tree - T, a; : Int | A 2 b Mi : r 
T | Xi : Tree - , * 2 : Tree - , A 2 b M 2 : r 



T | Ai, A 2 b case M of leaf x => A/i | node *1 *2 =b M2 : 



(T-App) 

(T-Plus) 

(T-Fix) 

(T-Leaf) 

(T-Node) 

(T-Case) 



Fig. 6 . Rules of typing judgment 

does not. The latter program violates the restriction specified by A that Xi and 
X 2 must be accessed in this order. 

r | A h M : t is the least relation that is closed under the rules in Figure 6. 
We explain only main rules below. T-Var1, T-Var 2 and T-Int are the rules for 
variables and integer constants. As in ordinary linear type systems, these rules 
prohibit variables that do not occur in a term from occurring in the ordered linear 
type environment. (In other words, weakening is not allowed on an ordered linear 
type environment.) That restriction is necessary to guarantee that each variable 
in an ordered linear type environment is accessed exactly once. 

T-AbsI and T-Abs 2 are rules for lambda abstraction. Note that the ordered 
type environments of the conclusions of these rules must be empty. This restric- 
tion prevents input trees from being stored in function closures. That makes 
it easy to enforce the access order on input trees. For example, without this 
restriction, the function 

At. let g = A /.(/ t) in ( g sumtree ) + ( g sumtree ) 

would be well-typed where sumtree is the function given in Figure 5. However, 
when a tree is passed to this function, its nodes are accessed twice because 
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the function g is called twice. The program above is actually rejected by our 
type system since the closure A /.(/ t) is not well-typed due to the restriction of 
T-ABS2. 1 

T-App is the rule for function application. The ordered linear type envi- 
ronments of Mi and M 2 , A\ and A? respectively, are concatenated in this 
order because when Mi M 2 is evaluated, (1) M\ is first evaluated, (2) M 2 
is then evaluated, and (3) M\ is finally applied to M 2 . In the first step, the 
variables in A\ are accessed in the order specified by A\. In the second and 
third steps, the variables in A 2 are accessed in the order specified by A 2 , On 
the other hand, because there is no restriction on usage of the variables in a 
non-ordered type environment, the same type environment (T) is used for both 
subterms. 

T-Case is the rule for case expressions. If M matches node X\ x 2 , sub- 
trees X\ and x 2 have to be accessed in this order after that. This restriction is 
expressed by x\ : Tree - , 2 : 2 : Tree~,Z\ 2 , the ordered linear type environment 
of M 2 . 



3.3 Examples of Well- Typed Programs 

Figure 7 shows more examples of we 11- typed source programs. The first and sec- 
ond programs (or the catamorphism [8] ) apply the same operation on every node 
of the input tree. (The return value of the function tree-fold, cannot, however, be 
a tree because the value is passed to g.) One can also write functions that pro- 
cess nodes in a non-uniform manner, like the third program in Figure 7 (which 
increments the value of each leaf whose depth is odd) . 

The fourth program takes a tree as an input and returns the right sub- 
tree. Due to the restriction imposed by the type system, the program uses sub- 
functions copy-tree and skip-tree for explicitly copying and skipping trees. 2 (See 
Section 7 for a method for automatically inserting those functions.) 



4 Translation Algorithm 

In this section, we define a translation algorithm for well-typed source programs 
and prove its correctness. 



1 We can relax the restriction by controlling usage of not only trees but also functions, 
as in the resource usage analysis [6]. The resulting type system would, however, 
become very complex. 

2 Due to the restriction that lambda abstractions cannot contain variables of type 
Tree d , we need to introduce sequential composition (;) as a primitive and extend 
typing rules with the following rule: 

r I Ai b Ml : t' r A > t~ M-2 : r t' + Tree d 



r | Ai , A 2 H Mi; M 2 : t. 



(T-Seq) 
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fix tree .map. Xf. Xt. (case t of leaf x => leaf (/ x) 
node xi X2 =$■ node ( tree-map f xi) ( tree-map f *2)) 
fix tree _/ 'old. Xf.Xg.Xt. 

(case t of leaf n => (/ n) 

| node ti £2 => (g ( tree-fold f g ti)(tree-fold f g £2))) 
fix inC-alt.Xt. (case £ of leaf x => leaf x | node x\ X2 =>• node 
(case xi of leaf y =>■ leaf ( y + 1 ) 

| node 2/1 2/2 => node ( inc-dlt yi) ( inc-alt 222)) 

(case X2 of leaf 2 => leaf (2 + 1 ) 

node 21 22 => node ( inc-alt 21) ( inc-alt 22)) 

let copyJ,ree = 

fix copy -tree. Xt. (case £ of leaf x ==> leaf x 

node xi X2 =>• node ( copy-tree xi) ( copy-tree X2)) in 

let skip-tree = 

fix skip -tree. Xt. (case £ of leaf x => 0 

| node xi X2 => ( skip-tree xi); ( copy-tree X2) in 
A£.(case £ of leaf x ==> leaf x j node xi X2 ==> ( skip-tree xi); ( copy-tree X2)) 



Fig. 7 . Examples of well-typed programs 



4.1 Definition and Correctness of Translation 

The translation algorithm A is shown in Figure 8. A maps a source program to 
a target program, preserving the structure of the source program and replacing 
operations on trees with operations on streams. 

The correctness of the translation algorithm A is stated as follows. 

Definition 4 . A function [ • ] from the set of trees to the set of streams is defined 
by: 

[ leaf i ] = leaf ; i 

[ node V 1 V 2 }= node; [ Vi ] ; [ V 2 ] . 

Theorem 1 (Correctness of Translation). 

If 0 | 0 b M : Tree~ — » r and r is Int or Tree + , the following properties hold 
for any tree value V: 

(i) MV — i if and only if [ V], 0 ) — ►* («, 0 , 0 ) 

(ii) M V — V V' if and only if (A(M)Q, [V ], 0 ) — >* ((), 0 ,[F']) . 

The above theorem means that a source program and the corresponding 
target program evaluates to the same value. The clause (i) is for the case where 
the result is an integer, and (ii) is for the case where the result is a tree. 

We give an outline of the proof of Theorem 1 below. Full proofs are found in 
the full version of this paper [7]. We define another reduction semantics of the 
source language and prove that (1) for well-typed programs, the new semantics 
is equivalent to the one in Section 2 and (2) each reduction step based on the 
new semantics has the corresponding one in the target program. 
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A(x) = x 
*4(i) = i 

A(Xx.M) = XxA(M) 

A(MiM 2 ) = A{M\) A{M 2 ) 

A(Mi + M 2 ) = A(Mi) + A{M 2 ) 

^(fix f.M) = fix f.A(M) 

^l(leaf M ) = write(leaf); write(^4(M)) 

^l(node Mi M 2 ) = write(node); A(Mi)\ A(M 2 ) 
^4(case M of leaf x => Mi j node xi x 2 => M 2 ) = 




case «4(M);read() of leaf =>• let 


x = read() in A(M\) 


node =>■ [()/xi,()/x 2 \A(M 2 ) 


Fig. 8. Translation Algorithm 


. (Mx,x^V) (M',5') ••• 


- (U',0) 


( 1 1 


l 


(A(M), [ V ] , 0) — >* (e, [UJjS'o) — (e',S[,S' 0 ) -+ • • • 


(0,0, [V']) 



Fig. 9. Evaluation of a source and the target program 



The new reduction relation is of the form ( M,S ) — ► ( M',5 ') where <5 is a 
sequence of binding of the form x i— > V . The formal definition is given in the 
full version [7]. The only difference from the one defined in Section 2 is that 
input trees are bound in 5 and must be accessed in the order specified by 5. So, 
evaluation based on the new rules can differ from the one in Section 2 only when 
the latter one succeeds while the former one gets stuck due to the restriction 
on access to input trees. The following theorem guarantees that this does not 
happen if the program is well-typed. 

Theorem 2 (soundness of the type system). If 0 | x : Tree - b M : r and 

( M , x i— > V) — >* (M\ S’) hold, then M' is a value or a variable, or ( M' , 5’) — * 
(M", 5") holds for some M ", 5" . 

Using the new semantics, we can prove that each reduction step of a source 
program has the corresponding one in the target program. Figure 9 illustrates the 
idea of the proof (for the case where the result is a tree). In the relation (M, 5) ~ 
(e,Si,S 0 ), e represents the rest of computation, S) is the input stream, and 
S 0 is the already output streams. For example, (node(leaf l)(leaf (2 + 3)),0) 
corresponds to (2 + 3, 0, node; leaf; 1; leaf). The formal definition of ~ is found 
in the full version [7]. ) We can show that (1) the target program A(M) can 
always be reduced to a state corresponding to the inital state of the source 
program M and that (2) reductions and the correspondence relation commute. 
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Those imply that the whole diagram in Figure 9 commutes, so that the source 
program and the target program evaluates to the same value. 

4.2 Efficiency of Translated Programs 

Let M be a source program of type Tree' — > Tree + . We argue below that the 
target program A(M) runs more efficiently than the source program unparse o 
M o parse, where parse is a function that parses the input stream and returns a 
binary tree, and unparse is a function that takes a binary tree as an input and 
writes it to the output stream. Note that the fact that the target program is a 
stream-processing program does not necessarily imply that it is more efficient 
than the source program: In fact, if the translation A were defined by A(M) = 
unparse o M o parse, obviously there would be no improvement. 

Intuitively, the target program being more efficient follows from the fact that 
the translation function A preserves the structure of the source program, with 
only replacing tree constructions with stream outputs, and case analyses on trees 
with stream inputs and case analyses on input tokens. 

In fact, by looking at the proof of Theorem 1, we know (see the full version 
for the reason): 

— The memory space allocated by A(M) is less than the one allocated by 
unparse o M o parse, by the amount of the space for storing the intermedi- 
ate trees output by parse and M (except for an implementation-dependent 
constant factor). 

— The number of computation steps for running A(M) is the same as the one 
for running unparseo Mo parse (up to an implementation-dependent constant 
factor) . 

Thus, our translation is effective especially when the space for evaluating M 
is much smaller than the space for storing input and output trees. 



5 Extensions 

So far, we have focused on a minimal calculus to clarify the essence of our 
framework. This section briefly shows how to extend the framework to be used 
in practice. More details are found in the full version [7]. 

5.1 Constructs for Storing Trees on Memory 

By adding primitives for constructing and clestructing trees on memory, we can 
allow programmers to selectively buffer input/output trees at the cost of effi- 
ciency of target prgorams. Let us extend the syntax of the source and target 
languages as follows: 

M ::= • • • | mleaf M | mnode M\ M 2 

(mease M of mleaf x =>■ M\ j mnode X\ X 2 => M 2 ) 
e ::= • • • | mleaf e j mnode ei e 2 

(mease e of mleaf x => e± \ mnode X\ X 2 => e-^) ■ 
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fix strrri-to-mem. 

At. case t of leaf x => mleaf x 

j node x\ *2 => mnode ( strm-to_mem * 1 ) ( strm-to_mem X 2 ) fix mem-tostrm. 

At. mease t of mleaf x =£■ leaf a: 

| mnode x\ xi =>■ node ( mem-tostrm xi) ( mem-tostrm * 2 ) 



Fig. 10. Definition of strm-to-mem and mem-tostrm 



let mswap = 

fix /.A t. mease f of mleaf x => leaf x 

| mnode xi X 2 => node (/ X 2 ) (/ xi) in 
fix swap_deep.\n.\t. 

if n = 0 then mswap ( strm-to-mem t) 

else 

case t of 

leaf leaf x 

| node xi X 2 => node ( swap-deep ( n — 1) xi) ( swap-deep ( n — 1) X 2 ) 



Fig. 11. A program which swaps children of nodes whose depth is more than n 



Here, mleaf M and mnode Mi M 2 are constructors of trees on memory 
and mease • • • is a destructor. 

We also add type MTree, the type of trees stored on memory. The type 
system imposes no restriction on access order between variables of type MTree 
like type Int (so MTree is put in the ordinary type environment, not the ordered 
linear type environment). The translation algorithm A simply translates a source 
program, preserving the structure: 

^4(mleaf M) = mleaf A(M) 

„4(mnode Mi M 2 ) = mnode „4(Mi) A(M 2 ) 



With these primitives, a function strm to mem, which copies a tree from the 
input stream to memory, and mem-tostrm, which writes a tree on memory to 
the output stream, can be defined as shown in Figure 10. 

Using the functions above, one can write a program that selectively buffers 
only a part of the input tree, while the type system guarantees that the selective 
buffering is correctly performed. For example, the program in Figure 11, which 
swaps children of nodes whose depth is more than n, only buffers the nodes 
whose depth is more than n. 

The proof of Theorem 1 can be easily adapted for the extended language. 



5.2 Side Effects and Multiple Input Trees 

Since our translation algorithm preserves the structure of source programs, the 
translation works in the presence of side effects other than stream inputs/outputs. 
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Our framework can also be easily extended to deal with multiple input trees, 
by introducing pair constructors and refining the type judgment form to r \ 
{si : Ai, ■ ■ ■ ,s n : A n } b M : t where si, . . . , s n are the names of input streams 
and each of A \, . . . , A is an ordered linear type environment. 

5.3 Extention for Dealing with XML 

We discuss below how to extend our method to deal with XML documents. 

The difference between binary trees and XML documents is that the latter 
ones (i) are rose trees and (ii) contain end tags that mark the end of sequences 
in the stream format. The first point can be captured as the difference between 
the following types (we use ML-style type declarations): 

datatype tree = leaf of int I node of tree*tree; 
datatype xmltree = leaf of pcdata 

I node of label * attribute * treelist 
and treelist = nil I cons of xmltree * treelist; 

While the type tree represents binary trees, xmltree represents rose trees. 
Based on the difference between these types, we can replace the case-construct 
of the source language with the following two case-constructs. 

caseElem t of leaf(x) => M 1 | nod e(l,attr,tl) => M 2 
caseSeq tl of nil => M\ | cons (x,xl) =>■ M 2 . 

Typing rules can also be naturally extended. For example, the typing rule for 
the latter construct is: 

r | A\ b tl : treelist r \ A 2 b Mi : r 
r \ x : xmltree, xl : treelist, A 2 b M 2 : r 

r | Ai, A 2 b caseSeq tl of nil =>■ Mi | cons(r, xl) => M 2 : r. 

The restriction on the access order is expressed by x : xmltree, xl : treelist, A 2 
as in T-Node. 

The translation algorithm (1) maps the pattern nil in the source language to 
the pattern for closing tags. (2) prepares a stack and confirms well-formedness 
of input documents. 



6 Related Work 

Nakano and Nishimura [10, 9] proposed a method for translating tree-processing 
programs to stream-processing programs using attribute grammars. In their 
method, programmers write XML processing with an attribute grammar. Then, 
the grammar is composed with parsing and unparsing grammars by using the 
descriptional composition [4] and translated to a grammar that directly deals 
with streams. Quasi-SSUR condition in [10] and single use requirement in [9], 
which force attributes of non-terminal symbols to be used at most once, seems to 
correspond to our linearity restriction on variables of tree types, but there seems 
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N — > node Ni N2 

Ni.inh = /1 N.inh, N2-inh = /2 N.inh Ni.syn Ni.inh 

N.syn = fs N.inh Ni.syn Ni.inh N2.syn N2.inh 
N — > leaf i 

N.syn = fi N.inh i 

fix f.Xinh.Xt . case t of 

leaf x =*• /4 inh x 

node xi X2 => let Ni.inh = fi inh in 

let Ni.syn = f Ni.inh xi in 

let N2-inh = f2 N.inh Ni.syn Ni.inh in 

let N2.syn = / N2.inh X2 in 

fs N.inh Ni.syn Ni.inh N2.syn N2.inh 



Fig. 12 . L-attributed grammar over binary trees and corresponding program 



to be no restriction that corresponds to our order restriction. As a result, their 
method can deal with programs (written as attribute grammars) that violate the 
order restriction of our type system, although in that case, generated stream- 
processing programs store a part of trees in memory, so that the translation may 
not improve the efficiency. On the other hand, an advantage of our method is 
that programs are easier to read and write since one can write programs as ordi- 
nary functional programs except for the restriction imposed by the type system, 
rather than as attribute grammars. Another advantage of our method is that 
we can deal with source programs that involve side-effects (e.g. programs that 
print the value of every leaf) while that seems difficult in their method based on 
attribute grammars (since the order is important for side effects). 

The class of well-typed programs in our language seems to be closely related 
to the class of L-attributed grammars [1]. In fact, any L-attributed grammar 
over the binary tree can be expressed as a program as shown in Figure 12. If 
output trees are not used in attributes, the program is well- typed. Conversely, 
any program that is well-typed in our language seems to be definable as an 
L-attribute grammar. The corresponding attribute grammar may, however, be 
awkward, since one has to encode control information into attributes. 

There are many studies on program transformation [14, 8] for eliminating 
intermediate data structures of functional programs, known as deforestation or 
fusion. Although the goal of our translation is also to remove intermediate data 
structures from unparse 0/0 parse, the previous methods are not directly appli- 
cable since those methods do not guarantee that transformed programs access 
inputs in a stream-processing manner. In fact, swap in Figure 2, which violates 
the access order, can be expressed as a treeless program [14] or a catamor- 
phism [8], but the result of deforestation is not an expected stream-processing 
program. 

Actually, there are many similarities between the restriction of treeless pro- 
gram [14] and that of our type system. In treeless programs, (1) variables have 
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to occur only once, and (2) only variables can be passed to functions. (1) corre- 
sponds to the linearity restriction of our type system. (2) is the restriction for 
prohibiting trees generated in programs to be passed to functions, which corre- 
sponds to the restriction that functions cannot take values of type Tree + in our 
type system. The main differences are: 

— Our type system additionally imposes a restriction on the access order. This 
is required to guarantee that translated programs read input streams se- 
quentially. 

— We restrict programs with a type system, while the restriction on treeless 
programs is syntactic. Our type-based approach enables us to deal with 
higher-order functions. The type-based approach is also useful for automatic 
inference of selective buffering of trees, as discussed in Section 7. 

The type system we used in this paper is based on the ordered linear logic 
proposed by Polakow [12]. He proposed a logic programming language Olli, log- 
ical framework OLF and ordered lambda calculus based on the logic. There are 
many similarities between our typing rules and his derivation rules for the or- 
dered linear logic. For example, our type judgment r \ A h M : r corresponds 
to the judgment b A of ordered linear logic. The rule T-Abs1 corre- 

sponds to a combination of the rules for an ordered linear implication and the 
modality (!). However, we cannot use ordered linear logic directly since it would 
make our type system unsound. Petersen et al. [11] used ordered linear types to 
guarantee correctness of memory allocation and data layout. While they used an 
ordered linear type environment to express a spatial order, we used it to express 
a temporal order. 



7 Conclusion 

We have proposed a type system based on ordered linear types to enable trans- 
lation of tree-processing programs into stream-processing programs, and proved 
the correctness of the translation. 

As we stated in Section 3 and 5, one can write tree-processing programs that 
selectively skip and/or buffer trees by using skip -tree, copy -tree, strui-to-mem 
and mem-tostrm. However, inserting those functions by hand is sometimes te- 
dious. We are currently studying a type-directed, source-to-source translation 
for automatically inserting these functions. 

In addition to application to XML processing, our translation framework may 
also be useful for optimization of distributed programs that process and commu- 
nicate complex data structures. Serialization/unserialization of data correspond 
to unparsing/parsing in Figure 1, so that our translation framework can be used 
for eliminating intermediate data structures and processing serialized data di- 
rectly. 
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Abstract. We introduce a novel implementation of subtyping among 
regular expression types in terms of Haskell-style type classes by making 
use of some modest type class extensions. We assume that each regular 
expression type has some underlying structured runtime representation. 
Hence, we not only check for the containment problem among regular 
expressions, but also automatically derive some appropriate casting func- 
tions among the underlying structured values. We believe that this work 
is the first step in adding type-safe XML support to languages such as 
Haskell. 



1 Introduction 

Regular expression types and regular expression pattern matching have been 
introduced in XML processing languages to provide type-safe transformation of 
XML documents. Languages based on these ideas such as XDuce [11,10] and 
CDuce [3] provide for some static safety guarantees. For example, assume func- 
tion f is of type Ri — ■» i ?2 where R± and i ?2 refer to some regular expression 
types. Assume that we apply f to some value v of another regular expression 
type i? 3 . For this application to be safe we need to verify that L(R 3 ), the lan- 
guage denoted by i? 3 , is a subset of L{R\). Hence, (sub)type checking in XDuce 
and CDuce boils down to checking for containment among regular expressions. 
Commonly, this is referred to as semantic subtyping. 

Traditionally, at runtime the structure imposed on values by regular expres- 
sion types is lost. XDuce and CDuce make use of a uniform (a.k.a. “flat”) runtime 
representation for values. The downside of such an approach is that a flat rep- 
resentation of values might incur some performance penalties. Furthermore, it 
becomes harder to introduce XDuce features into languages such as Haskell [17] 
unless we enforce that all Haskell values obey the flat representation. 

Here, we consider a novel approach to implement semantic sub typing such 
that we retain a structured runtime representation of values defined by regu- 
lar expression types. A key feature is that our implementation can be encoded 
in terms of Haskell type classes. Thus, Haskell’s type system and compilation 
scheme is expressive enough to integrate regular expression types into Haskell 

W.-N. Chin (Ed.): APLAS 2004, LNCS 3302, pp. 57-73, 2004. 
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without any significant changes to existing Haskell programming systems such 
as GHC [7]. 

The problem we are facing is as follows. Consider a structured value vl of 
regular expression type R\. Assume that we intend to use vl in a context where 
vl is expected to have type R 2 , L(R\) C L(R 2 ). Such uses of vl are safe. The 
standard method to check for containment among regular expression proceeds 
by converting regular expressions to DFAs, minimizing the DFAs etc. This gives 
us a proof method to check whether L(Ri) C L(R 2 ) holds. However, we need 
more than the proof. The challenge here is that in order to use vl in type 
context R 2 we need to change the structure of vl. That is, we need to (up)cast 
value vl defined by R\ into a value v2 defined by R 2 . Such an upcast function 
must exist because L(Ri) C L(R 2 ) holds. However, the standard method for 
checking containment among regular expressions does not give us any clues how 
to construct the appropriate upcast function. 

Our idea is to make use of a mostly overlooked algebraic proof method to 
check for language containment among regular expressions. Antimirov [2] intro- 
duced a proof system with judgments of the form b Ri < R 2 to decide whether 
L(R\) C L(R 2 ). We model such judgments in terms of the following Haskell 
class declaration 

class UCast rl r2 hi h2 I rl r2 -> hi h2 where ucast : :hl->h2 

Our task is to define the set of valid instances such that UCast rl r2 hi 
h2 is valid iff b rl < r2 is provable in Antimirov’s system. As a side product, 
the type class resolution mechanism [8] generates some “evidence” for 
ucast: :hl->h2 where hi and h2 refer to the underlying representations of rl 
and r2. That is, out of the proof for UCast rl r2 hi h2 we obtain a proof 
term for ucast : :hl->h2. Note that in the above class declaration we make use 
of a functional dependency [12] I rl r2 -> hi h2 which states that hi and 
h2 are uniquely determined by rl and r2. Antimirov describes the set of valid 
judgments b R\ < R 2 in terms of rules of the form 

, . b Ri < Ri . . . b R n < R' 

W b R < R' 

Such rules can directly be expressed in terms of Haskell instance declarations. 
Here is a type class encoding of the above rule (r). For simplicity, we have left 
out the instance body which describes the shape of the corresponding ucast 
function. 

instance (UCast R\ R[ Hi H[, . . . , UCast R n R' n H n H ' n ) 

UCast R R' H H' 

Of course, there are some tricky parts behind our type class encoding of 
Antimirov’s algorithm. There will be more about this in later sections. 

We continue in Section 2 where we introduce Antimirov’s algorithm. In Sec- 
tion 3 we show how to model semantic subtyping among regular expression 
types using type classes. In Section 4 we consider regular hedges (i.e., sequences 
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of trees) which requires a novel adaption of Antimirov’s algorithm. In Section 5 
we discuss related work and conclude. We assume that the reader is familiar 
with Haskell type classes [7]. 

2 Background: Antimirov’s Algorithm 

We assume that the reader is familiar with the basics of regular expressions. E.g., 
we consider R + as a short-hand for ( R, R *) and f?? as a short-hand for (()|i?) 
where () denotes the empty word. Regular expressions are defined as follows. 

Literals l ::= || ... 

Regular Expression R ::= () || l || R* || || (R,R) 

Note that we use || in BNF to eliminate the confusion with regular ex- 
pression operator | . Antimirov’s algorithm to check containment among regular 
expressions is defined in terms of a co-inductive term-rewriting system among 
judgments b R\ < R 2 where regular expressions R\ and R 2 are put into “normal 
form” if necessary. Before providing a formal account of Antimirov’s algorithm 
we take a look at an example. 

Example 1. Consider the goal b ((A*,B*),B*) < ( A*,B *) plus some proof 
rules. 

/ ^ b ( Ri,R 2 ) < R 3 b () < f ?4 b Ri < R 3 b R 2 < R 4 

b (Ri,R 2 ) < (R 3 ,Ra) b (R 1 ,R 2 ) < (R 3 ,R4) 

, ^ 0 < Rs ^ (-Ri,fl 2 ) < R 4 True 

{, ' y b (R 1 ,R 2 ) < (R 3 ,Ra) " b () < R* 

We have that L((A*, B*),B*) C L(A*, B*). However, the above rules are not 
strong enough to derive this statement. In the following, we write C — > r C 
to denote application of rule (r) to a set of judgments C yielding another set of 
judgments C' . E.g., we find that 

b ((A*, B*), B*) < ( A*,B *) — ♦ _,i b ((A*,B*), B*) < A* , b () < B* — Fail 

— >’_, 2 b ( A*,B *) < A*, b B* < B* — >* Fail 
— y-, 3 b () < A*, b {{A*,B*),B*) < B* — b Fail 

In the first step, we encounter three alternatives. After exploring all possi- 
bilities none of the derivations can be reduced to the empty store. Hence, we 
report Failure. Of course, we could now simply add more proof rules. Currently, 
we do not exploit the following equivalences ((A*, B*), B*) = (A*, (B*,B*)) = 
(A*, B*). However, incorporating all such equivalences into the reduction system 
would lead to non-termination. We cannot impose a definite order (e.g. from left 
to right) on our rewrite rules. 
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Apply if all other b Ri < R2 failed: 

Ni =norm(_Ri) N2 = norm(i?2) l~ i n f Ah < N2 

(Last) 

b R 1 < R 2 

Containment rules among deterministic LNFs: 



( 0 - 0 ) 



True 



hnf 0 < 0 



l“lnf ATi < A/3 h — I2 

(|--) binf N2 < N3 (m-ml) b Ri < R2 



Olnf (JVrIJVa) < N 3 



binf < ( h,R 2 ) 



(--10 



!' Inf Ah < A r 2 

bi„f AT, < (JV 2 |JV 3 ) 



(--1 2) 



• Ini' Ah < A^3 

binf AT, < (Ah ^3) 



Fig. 1 . Essence of Antimirov’s Algorithm 



To address this problem, Antimirov’s idea is to normalize regular expressions 
to Linear Normal Forms (LNF) in case a judgement b R± < R 2 cannot be 
verified. 

(Monomial) M ::= ( l,R ) 

(Sum) S ::= M || (M\S) 

(LNF) N ::= S || (Q|5) 

A LNF N is deterministic iff none of the monomials in N shares a common 
leading literal. In [2], Antimirov shows that every regular expression can be 
rewritten into a deterministic LNF. E.g., norm((Z, i?i)|(Z, R 2 )) = (/, (i?i|i? 2 )) 
where function norm turns regular expressions into deterministic LNF. Details 
can be found in [15]. 

Antimirov’s proof system has two layers. The first layer consists of proof 
rules among “ordinary” expressions. Often, we call these rules “ordinary” proof 
rules. We refer to [2] for a description of these rules. As already observed, we 
cannot find a finite set of ordinary proof system such that we obtain a complete 
proof system. Therefore, Antimirov introduces a second proof system among 
normalized expressions. We refer to Figure 2 for details. Rule (Last) allows us to 
switch from the ordinary proof system b to the normalized proof system bi n f . 
We assume that we have exhaustively applied all b proof rules. In the bi nf 
system we only need to consider normalized regular expressions. In rule (m-ml), 
we assume that the leading literals of the monomial on the llrs and rlrs are equal. 
We drop the literals and switch back to the b proof system. In case the leading 
literals differ we need to backtrack. Rule (()-()) deals with the empty type (). 
Rules (|-_), (_-|l) and (_-|2) deal with the union type |. Note that N^s are in 
LNF. Hence, we make use of the bi n f proof system in the premise of these rules. 

Note that Antimirov’s proof rules have to be interpreted co-inductively. That 
is, if we encounter a judgement which we are trying to verify we simply assume 
that the judgement holds. 
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Example 2. Consider b ((A*,B*),B*) < (A*, B*) which cannot be verified with 
the ordinary proof rules. Note that throughout the following derivation, we al- 
ways pick the first judgment in the constraint store for reduction. We find that 

b {{A*,B*),B*) < ( A*,B *) 

—Last hnf ()|(A((A*,B*),B*))|(B,((B*,B*)|B*)) < ()\(A,(A*,B*))\(B,B*) 
— hnf 0<Q\(A,(A*,B*))\(B,B*) 

bi nf (A,((A*,B*),B*))\(B,((B*,B*)\B*)) < ()|(A, (A* , B*))\(B, B*) 



C-|l hnf () < () 

b lnf (A,((A*,B*),B*))\(B,((B*,B*)\B*)) < ()|(A, (A*,B*))\(B,B*) 
— 0-0 hnf (A,((A*,B*),B*))\(B,((B*,B*)\B*)) < ()|(A, (A*,B*))\(B, B*) 
— »|-_ hnf (A, ((A*, £*),£*)) < ()| (A, (A* , B*))\(B , B*) 

hnf (£?,((£?*, B*)|B*)) < ()| (A, (A* , B*))\(B, B*) 

— »__|i Fail 

—\-\2 hnf (A, ((A* , B*), B*)) < (A, (A* , B*))\(B, B*) 
bi nf (B,((B*,B*)\B*)) < OKA, (A*,B*))|(B,B*) 

— _-|i hnf (A, ((A*, B*),B*)) < (A, (A*,B*)) 

b lnf (B,((B*,B*)\B*)) < ()| (A, (A* , B*))\(B, B*) 

> m-ml h ((A*,B*),B*) < ( A*,B *) 

b lnf (B,((B*,B*)\B*)) < ()I(A,(A*,B*))I(B,B*) 

— coiud hnf (B,((B*,B*)\B*))<Q\(A,(A*,B*))\(B,B*) 

— > True 

Note that during the derivation we encounter the judgment b ((A*, B*), B*) < 
(A* , B*) again which is resolved by applying the co-induction principle. 



Theorem 1 (Antimirov). Let R\ and f ?2 h two regular expressions. Then, 
L(Ri) C L(R 2 ) iff b Ri < R 2 is derivable in Antimirov’s proof system. 



3 Regular Expressions 

We show how to encode Antimirov’s algorithm in terms of Haskell type classes. 
First, we need to define an appropriate Haskell representation for regular ex- 
pression types and their underlying structured representation. We employ the 
following singleton types. 

data A = A 
data B = B 

data STAR x = STAR x 
data OR x y = OR x y 

E.g., we represent the regular expression (A|H)* in terms of the Haskell type 
(STAR (OR A B) ) . For convenience, we often make use of the more familiar 
regular expression notation. 
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The translation function [_] from regular expression types to underlying 
structured representation types is adopted from Wallace and Runciman [22], 

[()] = [Phi] [R*] = [[-R]] 

W =1-H l(Ri\R*)] =0r [Ril m 

m = B I(R 1 ,R 2 )] = ([R 1 ],[R 2 ]) 

where data Phi and data Or a b = L a I R b. Note that () is represented 
by the Haskell list type [Phi] where Phi has no associated value. Hence, the 
only Haskell value belonging to [Phi] is the empty list []. For each literal l we 
introduce data LH = LH. E.g., A is translated to data A_H = A_H. 

We note that function [_] can be modelled via type classes. 

class REtoHT r h I r -> h 
instance REtoHT () [Phi] 



Our type class representation of judgments h i?i < i? 2 is as follows. 

class ( REtoHT rl hi, REtoHT r2 h2 

) => UCast rl r2 hi h2 I rl r2 -> hi h2 where 
ucast : : rl -> r2 -> hi -> h2 

In contrast to the Introduction function ucast expects now two additional 
input values which refer to regular expressions. 

Encoding of the b proof system in terms of instance declarations is straight- 
forward. E.g., consider rules (,-,l), (,-,2) and (,-,3) from Example 1. Immediately, 
we obtain 

instance ( UCast (rl,r2) r3 (hl,h2) h3, UCast () r4 [Phi] h4 

) => UCast (rl,r2) (r3,r4) (hl,h2) (h3,h4) where — (1) 

ucast (rl,r2) (r3,r4) (hl,h2) = 

((ucast (rl,r2) r3 (hl,h2)), (ucast () r4 [] ) ) 
instance ( UCast rl r3 hi h3 , UCast r2 r4 h2 h4 

) => UCast (rl,r2) (r3,r4) (hl,h2) (h3,h4) where — (2) 

ucast (rl,r2) (r3,r4) (hi ,h2)=( (ucast rl r3 hi) , (ucast r2 r4 h2)) 
instance ( UCast () r3 [Phi] h3, UCast (rl,r2) r4 (hl,h2) h4 

) => UCast (rl,r2) (r3,r4) (hl,h2) (h3,h4) where — (3) 

ucast (rl,r2) (r3,r4) (hl,h2) = 

((ucast () r3 [] ) , (ucast (rl,r2) r4 (hl,h2))) 

The experienced Haskell programmer will notice that the above instances are 
overlapping. Note that we can unify the instance heads (rlrs of =>). In Haskell the 
common assumption is that instance declarations must be non-overlapping. This 
usually guarantees that there is a unique evidence translation for each type class 
constraint. In order to encode Antimirov’s algorithm we require for a modest 
extension of type classes. 

Definition 1 (Alternative Instances). For each instance TC tl ... tm we 

allow for a number of alternatives such as 
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instance 


Cl => TC 


tl . 


. . tm where . . . 


— (1) 


instance 


Cn => TC 


tl . 


. . tm where . . . 


— (n) 



We assume that type class resolution proceeds by first considering instance 
(1). In case of failure we try instance (2) and so on. 

Informally, type class resolution applies rule specified by instance and class 
declarations to a set of constraints yielding another set. A set of constraint is 
called final if no further rules can be applied. A path is a sequence ride appli- 
cations yielding from an initial constraint store to some final store. A set of 
constraints is called successful if False (the always unsatisfiable constraint) is 
not part of the set. 

Assume that TC\,...,TCk is the set of constraint symbols with an alternative 
number of instances. We say a set of constraints fails iff the set is unsuccessful, 
or we find TCi ti for some i in the set. Type class resolution fails if the final 
constraint store fails. 

We note that the alternative instance extension will be problematic in case 
of separate compilation. However, we assume that all “alternative” instances are 
kept in the same module. 

Example 3. Consider verifying b A < (H|H). Here are the relevant ordinary 
proof rules 



(-- | 1 ) 



b R\ < R.2 



(--| 2 ) 



b i?i < i?3 



or 



b R\ < -R 2 I-R 3 b Ri < -R 2 I-R 3 

There are two possible proof trees, 
b A < (A\A) 

b A < (A\A) 



(Taut) 



b True 
b R< R 



_L 

T 


1 

VI 


Taut True 


(1) 


|2 h 


VI 


*Taut True 


(2) 



In terms of our type class encoding, we can derive two versions of ucast on 
the same type. Here are the type class encodings of rules (_ — |1) and (_ — |2). 



instance ( UCast rl r2 hi h2 

) => UCast rl (r2|r3) hi (Or h2 h3) where 
ucast rl (r2|r3) hi = L (ucast rl r2 hi) 



( 1 ) 



instance ( UCast rl r3 hi h3 

) => DCast rl (r2|r3) hi (Or h2 h3) where — (2) 
ucast rl (r2|r3) h2 = R (ucast rl r3 hi) 

Depending on which instances we pick, we can either inject the value into the 
left or right component. To avoid such ambiguity problems we impose a textual 
order among instances. Based on this assumption, proof-term construction is 
deterministic. 
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To encode the I— i n f system we introduce an additional type class. 

class ( REtoHT nl hi , REtoHT n2 h2 

) => NUCast nl n2 hi h2 I nl n2 -> hi h2 where 
nucast : : nl -> n2 -> hi -> h2 

The encoding of the I— i n f proof rules is yet again fairly straightforward. E.g. , 
we have the following instance to express rule (()-()) in Figure 1, 

instance NUCast () () [Phi] [Phi] where 
nucast () () [] = [] 

However, the encoding of rule (Last) turns out to be more tricky. The in- 
stance UCast rl r2 hi h2 modeling rule (Last) must imply the existence of a 
ucast : :hl->h2 function. So far, our encoding of the the premise b] n f nl < n2 
only guarantees the existence of nucast: :hl’->h2’ where hi’ and h2’ are the 
structured representation of the LNFs nl and n2 of rl and r2. Fortunately, we 
have that there is a bijection between the structured representations of regular 
expressions and their LNFs. Hence, we first compute the LNFs nl and n2 of rl 
and r2. We then apply the bijection function between rl and nl and convert hi 
into hi’. Finally we apply the nucast: :hl’->h2’ and convert h2’ into h2 by 
applying the bijection function between n2 and r2. In terms of type classes, this 
can be phrased as follows. 

— norm(r)=n 

class Norm r n I r -> n where 
norm : : r -> n 

— bijection among underlying representation of r and norm(r) 

class ( REtoHT r h , REtoHT n h’ , Norm r n 

) => NormV r n h h’ I r -> n, r n -> h h’ where 

toNorm : : r -> n -> h -> h’ 

fromNorm : : n -> r -> IT -> h 

— last instance 

instance ( Norm rl nl , Norm r2 r2’, NormV rl nl hi hi’ 

, NormV r2 n2 h2 h2’, NUCast nl n2 hi’ h2” 

) => UCast rl r2 hi h2 where 
ucast rl r2 hi = let nl = norm rl 

n2 = norm r2 
hi’ = toNorm rl nl hi 
h2 ’ = nucast nl n2 hi’ 
in fromNorm n2 r2 h2’ 

Of course, we yet have to provide the appropriate instances for Norm and 
NormV. Note that both functions can readily be defined in terms of some first- 
order functional programs. Via some type class acrobatic we can easily encode 
their functionality on the level of types. The observant reader will notice that 
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the above instance violates the “termination” condition imposed on functional 
dependencies [5]. Potentially, this might lead to undecidable type inference. How- 
ever, the above use of “non-terminating” instances does not endanger decidable 
type inference in case of our particular type class application. 

What remains is to extend type class resolution with the co-inductive proof 
principle. 

Definition 2 (Co-inductive Type Class Resolution). Informally, when- 
ever we encounter a constraint we have seen before on the path yielding the 
current constraint store, we simply remove this constraint. 

Such an extension can be easily incorporated into existing implementations 
and is already available in the CVS version of GHC [7]. 

A complete description of all instances to encode Antimirov’s algorithm can 
be found in [15]. 

Theorem 2. Antimirov’s algorithm can be fully and faithfully encoded in terms 
of Haskell type classes by making use of alternative instances and co-inductive 
type class resolution. Additionally, we are able to construct proof terms repre- 
senting the appropriate casting functions. 

Analogue to UCast we can also describe down-casting in terms of type classes. 

class ( REtoHT rl hi, REtoHT r2 h2 

) => DCast rl r2 hi h2 I rl r2 -> hi h2 where 
dcast : : rl -> r2 -> hi -> Maybe h2 

We assume that DCast rl r2 hi h2 is valid iff h r2 < rl and we can 
“down-cast” hi into h2. Note that there is no difference on the level of types. 
However, there is a crucial difference on the value level. Down-casting may not 
be always successful. Hence, we introduce data Maybe a = Just a I Nothing 
to report failure if necessary. 

Example 4- Consider the derivation b A < (Q|A) — >__| 2 b A < A — >Taut 
True which makes use of the rules from Example 3, The corresponding DCast 
instances are as follows, 

instance ( DCast r2 r3 h2 h3 

) => DCast (rl|r2) r3 (Or hi h2) h3 where 
dcast (rl|r2) r3 (R h2) = dcast r2 r3 h2 h3 
dcast (L _) = Nothing 

instance DCast r r h h where 
dcast h = Just h 

Then, dcast (01 A) A (L [] ) yields Nothing whereas dcast (01 A) A 
(R A_H) yields Just A_H. 

An important property is that the down-casting of an upcasted value yields 
back the original value, i.e. dcast Rl R2 (ucast R2 Rl v) = Just v. In order 
to guarantee this property we simply put the DCast instances in the exact same 
order as their UCast counter part. 
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4 Regular Hedges 



In the context of type-safe XML processing languages such as XDuce and CDuce 
types are described in terms of regular hedges (sequences of trees) rather than 
regular expressions. 

Example 5. Here is a XDuce example [10] defining an address book type. 

type Addrbook = addrbook [Person*] 

type Person = person [Name, Tel?, Email*] 

type Name = name [String] 

type Tel = tel [String] 

type Email = email [String] 

Note that regular expressions types can appear under labels, e.g. The type 
Person* denotes a sequence of trees of type Person. 

In the following we show how the results from Section 3 can be extended to 
deal with regular hedges. 

The language of regular hedges is as follows. 

Regular Hedge R ::= () || l[R] || R* || (R\R) || (R,R) 

A regular hedge l[R] refers to a regular expression R under a label l. It is 
common to say that the regular expression R is guarded by label l. For simplicity, 
we omit base types like String, but they can be easily incorporated. The Haskell 
translation of regular hedge requires a minor change in the translation of regular 
expression, E.g., l[R] is translated to data l _ H = Z_H H where H is the translation 
of R. 

In order to accommodate regular hedges we first need to extend Antimirov’s 
proof system. We replace all occurrences of literal l by label type Z[i?] in the 
ordinary proof rules. This change implies that we need the following additional 
rule 

h = h b Ri < R‘i 
b h[Ri] < h[R* 

The changes required for the bi n f proof system are more substantial. Regular 
hedges make it necessary to adapt to our notion of deterministic LNF. 

Example 6. Consider 



b (hlRMiRWR'z)) < (Zi[i?i],R / 1 )|(Z 2 [ J R 2 ], J R')|(Zi[R 3 ],i?3)ia2[i?4],^) 



Ihs 



rhs 



Assume that we extend the definition of monomials by replacing l[R] for l. Note 
that there is no deterministic LNF for the rhs. E.g., (Zi[i?i], R'i) and (Zif/fe], R' 3 ) 
on the rhs share the same label. Relaxing the notion of deterministic LNF is not 
an option. Otherwise, Antimirov’s algorithm becomes incomplete. 
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As a solution we assume that all monomials with a common label are grouped 
under a union. We refer to the resulting expression as the Common-Labelled, 
Union (CLU). Here is the extended definition of normal forms. 

Monomial M ::=(l[R],R') 

Sum S ::= M || ( M\S ) 

LNF N ::= S || (()|S) 

CLU CLU(l) ::= \ je j{l[Rj], Rj ) 

CLU-Sum C ::=CLU(h) \\ (CLU(l 2 )\C) 

CLU-Sum-LNF B ::= C || (()|C7) 

We say a CLU-Sum-LNF is deterministic iff none of the CLUs shares a com- 
mon label. Note that we do not require LNFs to be deterministic anymore. 

E.g., the deterministic CLU-Sum-LNF of the rhs in the above example is as 
follows 

I ((W^KWX)) 

S S V X 

CLU{h) CLU{1 2 ) 

What remains is to check each monomial on the lhs against the CLU sum. 
That is, 

F i n f lhs < CLU{1\) V I - i n f lhs < CLU(l 2 ) 

Additionally, we rephrase rule (m-ml) from Figure 1 as follows. 

l\ = l 2 b Ri < i?2 b R 'y < R 2 
m-ml — 

binf (h[R[},Ri) < {k[R! 2 \,Ri) 

However, the resulting proof system is still not strong enough. 

Example 7. Consider the following inequality, 

(«[i? 1 |i? 2 ],(i? , 1 |^)) < (llR^R'MR^R'z) 

The lhs is contained in the rhs, but neither in (7[!?-|], R .\ ) nor (l[R 2 ],R 2 ). 
Hence, our current adaption of Antimirov’s proof system fails, in other words, 
it is still incomplete. 

Applying the distributivity law on the lhs might help in this example. How- 
ever, in case there is a type guarded by the label on the lhs, this potentially 
leads to non-termination. 

Such problems have already been recognized by Hosoya [9] in the context 
of a slightly different algorithm for containment checking. Hosoya established a 
powerful law (originally from a lemma in [1]) which allowed him to establish 
completeness for his algorithm. 
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Apply if all other b Ri < R 2 failed: 

A r i = norm'(_Ri) B 2 = clu o norm^ife) I - i n f N\ < B2 

(Last) 

Rl < R 2 

Containment rules among LNFs and CLUs: 



(m-cl) 



h=h t = S , |T|i? 1 | . . . \R n 
for each I C [1, . . . , n], I = [1, . . . , n]\I 
H (h[S],T) < (l 2 [\ ieI R4,r)\(l 2 [T],\ iGl R') 
b lnf (h[S],T) < (l[R 1 ],R' 1 )\...\(l[R n ],R' n ) 



((H)) 



True 

h lnf 0 < 0 



(-| 1 ) — Nl - B2 (- 12 ) 

■ I„f .Vi < (B 2 \B 3 ) 



fclnf ATi < B 3 bi n f N 2 < B 3 
h}„f (iVijiVb) < B 3 
h} n f Ni < B 3 
hlnf Ni < (B 2 \B 3 ) 



Fig. 2. Essence of Antimirov’s Algorithm extended with Hosoya’s Law 



Theorem 3 (Hosoya2000). For any regular expression S,T, Ri, . . . R n . We 
have that 



L(l[ff],T)CL((l[R 1 ],R' 1 )\...\(l[R n ],R' n )) 

_iff 

for each I C {1, . . . , ?r} 7 / = {1, . . . , n}\I we have that 
L(l[S],T) C L((l[\ ieI R i ],T)\(l[T},\ ieT R')) 

where r = S'|T|i? 1 | . . . \R n is the maximal type. 

Based on the above Theorem we extend Antimirov’s algorithm to regular 
hedges. We refer to Figure 2 for details. In rule (Last) we compute the determin- 
istic CLU-Sum-LNF of the rlrs via function clu. Function norm’ is a simplified 
version of function norm where we do not require LNFs to be deterministic. 

In rule (m-cl) we make use of Hosoya’s Theorem. Rule (()-()) deals with the 
empty type (). Rules (|-_), (--|1) and (_-|2) deal with the union type |. Note that 
(Bi\B 2 ) denotes a union of unions (i.e. a union of CLUs), and CLU(l) denotes 
a union of monomials. Hence there is no overlap with rules (m-cl) and (m-c2). 

In fact, as we have learnt recently, a similar extension of Antimirov’s algo- 
rithm has been considered by Kempa and Linnemann [13]. However, the crucial 
difference is that they do not consider how to derive proof terms, i.e., casting 
functions out of proofs. Note that it is by no means obvious how to define An- 
timirov’s extended proof system in terms of type classes. Fortunately, we can 
apply a trick to achieve a type class encoding. 
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Example 8. Consider bi„f l[S],T < R' 1 )\(l[R, 2 ], R' 2 ) (0). The only rule 

applicable is (m-cl). We conclude that the above holds iff we can verify that 

b /[5],T<^],r)|(/[r],(^i| J R' 2 )) A(l) 
h l[S],T < (1[Ri\,t)\(1[t], R' 2 ) A (2) 
b l[S],T< {1[R 2 ],t)\(1[t], R[) A (3) 
b l[S],T< (/[ J R 1 |i? 2 ],r)|(/[r]^) (4) 

where r = \R[ |A 2 |A 2 and </> denotes the empty language. Silently, we 

assume that none of the regular expressions S, T, R[, R 2 and R' 2 denotes 
the empty language. We can enforce this assumption via a emptiness check. Note 
that (0) has been transformed into four sub-statements. The question is which of 
the sub-proof terms corresponding to (1-4) shall we pick to construct nucastO : : 

(i[s],T) - im,T) ] - miRihR'imRib^w 

It seems we are stuck. But wait! Consider statements (1) and (4). We have 
that neither b (Z[5],T) < (l[(j>], (R[ |i? 2 )) nor ^ (/ [5*] , X 1 ) < (/[l?i|A 2 ], 4 1 ) holds. 
Hence, (1) and (4) are equivalent to b ((I[5],T) < (Z[t], and ^ " 

((Z[5],T) < (Z[i?i |i? 2 ], t)). Hosoya observed that (Z[i?],r) IT (Z[r],f?') = {l[R],R') 
holds in general 1 . Hence, (1) and (4) are in fact equivalent to b (/[.S'], T) < 
(/[f?i|f? 2 ], (R[\R 2 )). In summary, (1-4) are equivalent to 

b l[S\,T< Z[i?i|i? 2 ], {R[\R' 2 ) A (1&4) 

b l[S],T< (llR^rmr],^) A (2) 

b 1[S],T<(1[R 2 \,t)\(1[t],R[) A (3) 

b (/[l? 1 ],l?' 1 )|(/[f? 2 ],l?')<(/[l?. 1 |l? 2 ],(f?' 1 |l?')) (5) 

Note that we added the somewhat redundant statement (5) (which is is a 
tautology). This assumption will become essential to construct nucastO. 

In the following reasoning steps, we neglect the underlying representation 
types for simplicity. Assumption (1&4) implies the existence of a 

ucastlAnd4: :(Z[S],T) -4 (l[Ri\R 2 ], (iJjJi?^)) 

Assumption (5) implies 

dcast5: : (l[Ri\R 2 ], (R[\R 2 )) - (Z[Ar], ^)|(Z[A 2 ], A' 2 ) 

The remaining assumptions (2) and (3) guarantee that clown-casting via 
dcast5 will never fail. Hence, 

nucastO v = let Just v ’ = dcast5 (ucastlAnd4 v) 
in v’ 

We conclude that for proof term construction of (0), we only make use of 
the proof terms resulting from (1&4) and (5). We do not make use of the proof 



1 Silently, we introduce binary operator fl to our regular hedge language, where fl 
denotes the intersection of two regular hedges. We have that b Ai < (A 2 fl A 3 ) iff 
b Ri < i? 2 A b Ai < A 3 . 
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terms resulting from (2) and (3). However, the fact that their proofs exist allows 
us to conclude that down-casting never fails. 

The above “trick” works in general. Consider the general case 

hnf (l[S],T) < (l[R 1 ],R' 1 )\.,.\(l[R n ],R' n ) (0) 

According to Hosoya’s Law, (0) holds iff 

I- (l[S],T) <r\MieiRJ,r)\l[r]A ieI R^ 
where I C [1, . . . , n], I = [1, . . . , n]\I 

where r = S'|T|i?i|...|i?„|i? , 1 |...|i? , n . Each instance of / is an element of the 
powerset P([l, . . . ,n]). Hence in general, there are 2" instances of I. Following 
our reasoning in Example 8, we can rephrase rule (m-cl) in Figure 2 as follows. 

h = h r = S\T\R 1 \...\R n 

h (l[S],T) < (l[R 1 \...\R n ],(R> 1 \...\R' n )) (1&2") 

h (/[5],T)<a[i? 1 ],r)|(/[r],(^|...K)) (2) 

h (l[S\,T) < a[P 1 |...|P„_ 1 ],r)|G[r],P'„) (2" - 1) 

h (l[R 1 ],R 1 )\...\(l[RnlRn) < (l[Ri\...\R n }AR'i\-\K)) (Taut) 

h„f(Zi[S],T) < (l[R 1 ],R' 1 )\...\(l[R n ],R' n ) 

To implement the rephrased rule (m-cl) we introduce a new type class to 
to compute the rhs of (1&2™) (the “key”), and the sequence of rlrs of (2), ..., 
(2™ - 1) (the “guards”). 

data KAG k gs = KAG k gs 

— k represents the key and gs represents the list of guards 
class CLUtoKAG clu kag I clu -> kag where clutokag : : clu -> kag 

Here is the encoding rule (m-cl). 

— ( (1 [rl] ,rl ’ ) I rs) is the CLU. 

instance ( REtoHT gs hgs, REtoHT k h3, REtoHT (l[s],t) (h,h’) 

, CLUtoKAG ( (1 [rl] ,rl ’ ) I rs) (KAG k gs) — (LI) 

, UCast (l[s],t) k (h,h’ ) h3 — (L2) 

, UCasts (l[s],t) gs (h,hd) hgs — (L3) 

, DCast k ( (1 [rl] , rl ’ ) I rs) h3 (Dr (hi, hi’) hs)) — (L4) 

=> NUCast (l[s],t) ( (1 [rl] , rl ’ ) I rs) (h,h ; ) (Or (hi, hi’) hs) where 
nucast (1 [s] ,t) ( (1 [rl] , rl ’ ) I rs) (h,h’) = 

let (KAG k gs) = clutokag ( (1 [rl] ,rl 1 ) I rs) — (L5) 

h3 = ucast (l[s],t) k (h,hl ; ) 
in case (dcast k ( (1 [rl] ,rl ’ ) | rs) h3) of — (L6) 

Just hr -> hr 

Nothing -> error "This will never happen." 
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At location (LI) we compute the key and the list of guards. Note that the 
key value k will be requested in the instance body (see locations (L5) and 
(L6)) whereas the guards are only present on the level of types. At location 
(L2) we verify assumption (1&2"). At location (L3) we verify assumptions (2), 
...,(2 n — 1). Type class UCasts is a variant of UCast, allowing us to verify a set of 
assumptions. At location (L4) we verify assumption (Taut). The instance body 
should not contain any more surprises. 

The remaining rules in Figure 2 can be straightforwardly encoded in terms 
of type classes. Note that we need to introduce type classes Norm’ and Clu to 
model functions norm’ and clu. Full details can be found in [15]. 

Theorem 4. We can fully and faithfully encode Antimirov’s algorithm extended 
to deal with regular hedges in terms of type classes. Additionally, we are able to 
construct proof terms representing the appropriate casting functions. 



5 Related Work and Conclusion 

Our use of type classes to model subtyping among regular expressions appears 
to be novel. To deal with regular hedges we make use of a novel adaption of 
Antimirov’s algorithm. Such an extension of Antimirov’s algorithm has already 
been considered by Kempa and Linnemann [13]. They also exploit Antimirov’s 
algorithm for containment checking. But, they do not consider how to derive 
proof terms to represent casting functions. As we have seen this becomes non- 
trivial in case of regular hedges. 

Thiemann [21] makes use of type classes to check for correctness of con- 
structed XML values in the context of the combinator library WASH. However, 
he does not consider destruction of values, i.e. how to generate the appropriate 
casting functions. 

Crary [4] implements a calculus with higher-order subtyping and subkind- 
ing by replacing uses of implicit subsumption with explicit coercions. Coercion 
functions are derived out of the proof from subkinding judgments. At this stage 
we do not know whether his proof system can be adapted to the setting of reg- 
ular expression types, nor whether type classes can be used to encode his proof 
system. We plan to pursue these questions in the future. 

Frisch and Cardelli [6] introduce a linear-time regular pattern matching al- 
gorithm over structured run-time values to avoid the exponential explosion in 
case of the “local longest match” policy. We conjecture that such problems can 
be circumvented altogether in our implementation of semantic subtyping based 
on Antimirov’s algorithm. We refer to [15] for details. 

We see the current work as the first step in adding type-safe XML sup- 
port to languages such as Haskell. We are near to completion in the design of 
XHaskell [14], a variant of Haskell to support regular expression types, regular 
expression pattern matching and semantic subtyping. Our implementation of se- 
mantic subtyping in terms of type classes is crucial to achieve this goal. XHaskell 
is translated to Haskell by a source-to-source transformation. Then, XDuce-style 
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type checking and insertion of casting functions is performed by type class res- 
olution. The “alternative instance” extension is currently not available in any 
known Haskell extension. In fact, such an extension is straightforward given the 
close connection between type class programming and logic programming [16,18] 
We plan to incorporate alternative instances into the Chameleon system [20] 
our own experimental version of Haskell. An important issue will be how to map 
type class errors back to meaningful type error messages in XHaskell. We expect 
that we can rely on our own work on type debugging [19]. We plan to pursue 
this topic in future work. We already conducted some initial experiments which 
show that XHaskell’s run-time performance is competitive in comparison with 
XDuce and CDuce. We intend to provide a more detailed comparison in the 
future. To enable full XML support we yet need to add recursive types to our 
type language. We refer the interested reader to [15] for more details. 
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Abstract. We propose a new implementation scheme for XML transformation 
languages through derivation of stream processors. Most of XML transformation 
languages are implemented as tree manipulation, where input XML trees are com- 
pletely stored in memory. It leads to inefficient memory usage in particular when 
we apply a facile transformation to large-sized inputs. In contrast, XML stream 
processing can minimize memory usage and execution time since it begins to out- 
put the transformation result before reading the whole input. However, it is much 
harder to program XML stream processors than to specify tree manipulations be- 
cause stream processing frequently requires ‘stateful programming’. This paper 
proposes an implementation scheme for XML transformation languages, in which 
we can define an XML transformation as tree manipulation and also we can obtain 
an XML stream processor automatically. The implementation scheme employs a 
framework of a composition of attribute grammars. 



1 Introduction 

In recent years, various languages specialized for XML transformation have been pro- 
posed [25, 28, 10, 3], Since XML documents have tree structure, these languages support 
various functions of pattern matching for paths in order to access particular nodes. These 
node accessing methods are generally implemented as tree manipulation that requires the 
whole tree structure of an input XML document to be stored in memory. This implemen- 
tation might be inefficient in memory usage in particular when a facile transformation, 
such as tag renaming and element filtering, is applied to large-sized XML documents 
because it does not require all information about the tree structure. 

XML stream processing has been employed as one of solutions to reduce memory 
usage[22, 5]. XML stream processing completes a transformation by storing no tree 
structure of XML documents in memory. While taking advantage in memory usage, 
XML stream processing has a problem that programs are quite complicated because 
XML stream processing is defined by ‘stateful program’ in the sense that programmers 
need to consider what to memorize on reading a start tag, what to output on reading 
an end tag, and so on. It imposes a burden on programmers and causes error-prone 
unreadable programs. 

To release programmers from the bother of stream processing, two kinds of ap- 
proaches have been proposed. The first approach is to add various primitive combi- 
nators or functions for stream processing^ 1, 27]. Though it helps us to make stream 
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(I) Converting into an ATT 

(II) Composing with an ATTp flrac 

(III) Stream processing style evaluation 



Fig. 1 . Derivation of stream processing programs 



processing programs, we still need to write programs with the prospect that the input 
XML is not a tree but a stream. The second approach is to give a mechanism deriv- 
ing a stream processing program from a specification of a tree manipulation or node 
accessing[l, 17, 19,4, 15]. We can write the program in tree manipulation style and need 
not to bear in mind that the input XML is a stream. However, the derivation mechanism 
can be applied to only a restricted class of XML transformations. 

This paper basically takes the second approach, provided that the derivation method 
can deal with a wider class of XML transformations than before. Our derivation mecha- 
nism is based on a framework of attributed tree transducer{ ATT) [6] which is a formal 
computational model for attribute grammar(AG) [12]. The mechanism is summarized 
by dividing it into three steps (Fig. 1). In the first step (I), a given XML transformation is 
converted into an ATT specialized for XML transformation, called XatT - ' 4 . The Xatt*~ s 
represents a transformation from XML trees to XML streams. This paper roughly shows 
how to convert an XML transformation into XatL~' s through two kinds of practical 
XML transformation: XPath querying [24] and partial modification. The former returns 
the collection of all elements specified by a given XPath expression. It is useful in the 
case that the input XML is regarded as a database. The latter returns an XML with a 
similar structure to the input XML where several fragments in the input are replaced with 
different ones. We can write a variety of XML transformations by combination of these 
basic transformations, though they are only a subset of the class of XML transformation 
specified by Xatt ( ~ s . However, it does not show the limitation of this approach. This pa- 
per claims that we can derive an XML stream processor from every XML transformation 
program defined by Xatt*^ s which can deal with a wide class of XML transformation 
as well as non-circular AG. If one wants to implement an XML transformation lan- 
guage intended for stream processing, all he has to do is to give a conversion method 
into Xatt*~ s for the language. The stream processor is derived from an Xatt ,—S by the 
following two steps. 

In the second step (II), we can obtain an Xatt s ~ s , which represents a transformation 
from XML streams to XML streams, from an Xatt* ’ s . We employ a descriptional com- 
position [8], which is a composition method for AGs. If two AGs are given, where one 
represents a transformation from a language La to a language L /> and another represents 
a transformation from a language Lb to a language Lc, then we can obtain a single AG 
from La to L ( < by descriptional composition. The obtained AG does not create any expres- 
sions in Lb even as intermediate result. The method can also be applied to ATT [14, 16]. 
In our framework, we use this method to compose a given Xaff t— with the other ATT rep- 
resenting a parsing function for XML and obtain an XatL >s . When more appropriate, the 
original descriptional composition cannot be applied to the parsing ATT because of the re- 
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striction of the method. For this reason, we use an extended descriptional composition[ 1 6] 
which can deal with ATTs with stack devices required in the parsing ATT. 

In the third step (III), from the XatL~ s derived in the previous step, we can obtain 
an XML stream processor by specifying evaluation order for attributes in stream pro- 
cessing style. The XML stream processor can be derived by dividing the computation by 
Xatt s— into the computation for each input symbol. This generation method is similar 
to Nakano and Nishimura’s method in [17, 19]. However, they failed to deal with a large 
class of ATTs because they sticked to derivation of finite state transition machines with 
dependency analysis. 

This paper also shows benchmark results to illustrate effectiveness of our framework. 
The author has implemented an XML transformation language XTiSP (an abbreviation 
for XML Transformation language intended for Stream Processing) based on our frame- 
work. Using XTiSP, we can program an XML transformation as tree manipulation with 
XPath and obtain the corresponding XML stream processor for free. We compare an 
implementation of XTiSP with two XSLT processors, Xalan [26] and SAXON [23]. 

Related Work 

The issue of automatic generation of stream processors has been studied for various 
languages. Most of these studies, however, targeted for only query languages such as 
XPath[ 1 , 4, 9] and a subset of XQuery [15]. These querying languages have few express- 
ibility to specify XML transformation. For example, they could not define the structure- 
preserved transformation, such as renaming a tag name a into b. Our framework allows 
not only querying functions but also ability to specify such transformation. 

Our framework is based on the previous work [17, 19]. They succeeds in deriving a 
stream processing program from the specification of XML tree transformation defined 
by an ATT. However, in their framework, a set of ATT s we can deal with and input XMLs 
for them are quite restricted because of the weak applicability of AG composition method 
[8]. Our framework solves this problem by using the extended composition method [16]. 

Additionally, there are several work using a framework of attribute grammars for 
XML transformation. Whereas we use binary representation for XML trees, [18,2] 
improve a traditional attribute grammar for unranked representation. However, their 
attribute grammars do not have a framework of descriptional composition we require. 
Although [13] employs an attribute grammar for XML stream processing, they uses only 
L-attribute grammars. In our framework, attribute grammars for XML stream processing 
are non-circular, which is wider class than L-attribute, and they are automatically derived 
from tree manipul ation programs. 

Outline 

The rest of this paper is comprised of seven sections, including introduction. Section 2 
gives basic notations and a brief introduction of attributed tree transducers. In Section 3, 
we show how to specify XML transformation by using Xatt t ~ s . Section 4 presents a 
composition method of Xatt*^ s and an XML parser. In Section 5, we mention how to 
obtain an XML stream processor from Xatt s ^ s . Then Section 6 shows several benchmark 
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results to illustrate effectiveness of our framework. Finally we conclude this paper in 
Section 7. 

2 Preliminaries 

2.1 Basic Notions 

The empty set is denoted by 0. The set of natural numbers including 0 by N and the set 
of natural numbers excluding 0 by N+. For every n £ N, the set {1,. . ,,nj is denoted 
by [n]. In particular, [0] = 0. We denote a set of finite strings over a set P of symbols by 
P* . A null string is denoted by s. 

A ranked alphabet £ is a finite set in which every symbol is associated with a non- 
negative integer called rank. We denote the rank of a symbol a by rank(a). We may 
write to indicate that rank(a) = n. Let £ be a ranked alphabet and A be a set of 
variables disjoint with £. The set of Y,-labeled trees indexed by A, denoted by T S (A) 
(or T s , if A is empty), is the smallest superset T of A such that cr(fi, • • • ,t n ) £ T for 
every € £ and t\, ■ ■ ■ ,t n £ T. 

We denote by t\x := s] the substitution of occurrences of a variable x by s. Let 
t, Si , . . . , s n ,ui, . . . ,u n be trees inT s (X) such that every u; for i £ [n]+ is a subtree of t, 
provided that U{ is not a subtree of Uj for any i and j with i f j. The tree t[u \ , . . . , u n := 
Si,...,s„], or t[ui := Si]ie[ n ], is obtained from t by simultaneously replacing every 
subtree at the points of occurrences of u\ , . . . , u n by the trees si,...,s n , respectively. If 
p = [«i, . . . := si, . . .], then we may write p(f) for t[u\, ...:= s\,.. .]. 

The prefix-closed set of all paths of t, denoted by path(t)( C Nl), is defined by 
path(a(ti,-- ■ , tk)) = {e}U{iu; \ i £ [k\,w £ path{U)} if £ £. We write t\ w for a 
subtree of a tree £ at a path w £ path(t). Every path w £ path(t) refers to a corresponding 
label of t, denoted by label(t,u>), which is defined by label(a(ti, ■ ■ ■ ,tn),z) = a an d 
label(<i(ti,--- ,tn)A w ) = label(ti,w) for every i£ [n] and w £path(ti). 

A reduction system is a system (A,=>) where A is a set and => is a binary relation 
over A. We write m => n a n+ \ if a, => o* + i(i £ [n]) for some ,a n+ 1 £ A. In 

particular, a =>° a. a £ A is irreducible with respect to if there is no c £ A such 
that b => c(^ b). If b is irreducible where a b for some igN and there is no pair of 
irreducible term b'(yf b) and i' £ N such that a =>* b ' , we say b is a normal form of a 
and write «/(=>, a) for b. 

2.2 Attributed Tree Transducers 

We give a brief introduction of attributed tree transducer(ATT) which has been intro- 
duced by Fiilop [6] as a formal computational model of attribute grammar(AG). See [7] 
for detail. The ATT M is defined by a tuple M = (Syn. //;/;.£. A. s,-„, jj, R), where 

- Syn is a set of synthesized attributes. Inh is a set of inherited attributes. We denote 

| s £ Syn,k £ [n] } and {i(n) \ i £ Inh} by n sv „(n) and IT,-,,;, , respectively. 

- £ and A are ranked alphabets, called the input and output alphabet, respectively. 

- Si„ is the initial (synthesized) attribute and jj is the root symbol with rank 1. These 
are used for specifying the computation result. 

- R is a set of attribute rules such that R = U CTesu m R a with finite sets of a-rules 
satisfying the following conditions: 
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• For every s £ Syn and a £ £, R G contains exactly one attribute rule of the form 
of s(it) — > r| where rj £ T E (IT s;yn (ranA:(a)) U II, ■„/,). 

• For every i £ Inh and a £ X, R 1 contains exactly one attribute rule of the form 
of i(nk) — > r) where k £ [rank (a)] and rj £ X E ( TT S> ,„ ( rank ( a) ) U II, ■„/,). 

• contains exactly one attribute rule of the form of s,„( n) — » rj where r| £ 

r E (n sy „(i)). 

• For every i £ Inh , contains exactly one attribute rule of the form of i(nl) — > r| 

where r\ £ T S (II. SV „(1)). 

where we use ji, jtI, tc2, • • • for path variables. 

The computation by M for input tree t £ T E is defined by a reduction system 
(U,=>Mt(t)) where U = T&({a(w) \ a £ SynUInh,w £path(t)} and => Mi is defined 
by 



- s(w) =>M,t tfl 71 := w] where s £ Syn, s(n) — > rj £ R a and ct = label(t,w). 

- i(wk ) =>M,t Ti[7r := w\ where i £ Inh, s(nk) — » rj £ R° and a = label(t,w). 

- 8(..., => M ,t 8(..-»-jT| , fc ,...) where 8 € A and r\ k => M ,t Ffc- 

where [it := w) with w £ N* stands for a substitution [/(7i),i(Ttl),5'(7t2), • • • := i(w), 
s (wl) , s(w 2) ,•••],-£ inh.seSyn ■ The transformation result by M for the input tree t is defined 
by n f m , t(t) i s i'n(s))(€ 7 a). 

3 Attributed Tree Transducers for XML Transformation 

Our framework requires XML transformations to be defined by specialized ATT s in order 
to obtain the corresponding XML stream processor. In this section, we first introduce a 
model of XML trees and XML streams as annotated trees. Next we extend ATTs to ones 
that deal with annotated trees, called attributed tree transducers for XML(X att t_>s ). Then 
we show several examples of Xatt*^ s which represent basic XML transformations. 

For simplicity, we deal with XML documents with no character data and no attribute. 
Our framework can be easily extended with them and the actual implementation supports 
them. In the rest of paper, a bare word ‘attribute’ is not used for XML attributes but for 
synthesized/inherited attributes in ATTs. 

3.1 XML Trees and XML Streams 

We use a binary representation for XML trees in which each left branch points to the 
leftmost child and each right branch to the first sibling node to follow in the tree 
structure of XML. For example, consider a fragment of XML 
<a><b/><c/><d/x/a>. It has tree structure as shown 
in the upper-right figure. In binary representation, the XML 
tree is figured as shown in the lower-right one where a black 
bullet stands for a leaf which means the end of siblings. Every 
tree in binary representation is defined by a tree over X tree = 

{ N ( 2 ) , L ( T } where N is a binary symbol annotated with a label 
and L is a nullary symbol corresponding a leaf. For instance, 
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we write l\l a (N b (L, l\l c (L, l\l d (L, L))), L) for the XML tree above. The set of trees in 
binary representation is denoted by T S(TO . 

XML streams are defined by monadic trees over Y, stream = {S^, E^.Z 10 )} where 
S and E are unary symbols corresponding a start tag and an end tag, respectively, each 
of them is annotated with a label as well as N, and Z is a nullary symbol standing for 
the end of stream. For instance, we write S a (S b (E b (S c (E c (E a (Z)))))) for a fragment of 
XML <a><b/><c/xa>. The set of streams is denoted by T Sj(reom . 

3.2 Extension of Attributed Tree Transducers with Annotation 

Xatt*~ s deals with annotated trees such as XML trees and XML streams as ATT deals 
with trees. An Xatt*^ s is defined by a tuple of M = (Syn,Inh,Y,A,R) 1 in similar way to 
an ATT, where X = Ytree and A = Y stream since an Xatt*~ s represents a transformation 
from XML trees to XML streams. The major difference from ATT is that R is defined 
by U with finite sets R® x of a x -rules in which we can use the annotation x 
in a right-hand side of an attribute rule. For example, R Nx may contain an attribute rule 
s(7t) — > Sa;(s(7il)). We show a simple example of XatT^ s which represents an identity 
XML transformation that takes an XML tree and returns the corresponding XML stream. 
The Xatt*^ s M i( j = ( Synjnh , X, A, R ) is defined by 

- Syn = {$} and Inh = {/} 

- R = U R Nx U R l where 

• R i = {sm(tt) — ► s(rcl), i(nl) — > Z} 

• r n * = { s (k) — > S x (s(7tl)), i(jtl) -> E x (i(7t2)), i(j t2) — > i(rc)} 

• R l = {s(n) — > i(n)}. 

Letting t = N a (N b (L, N a (L, L) ) , L) be a given input XML tree and => stand 
for the transformation result of M,>/ is computed by 

Sin(e) => s( 1) => S a (s(ll)) => S a (S b ( 5 (lll))) => S a (S b (i(lll))) 

=► S a (S b (E b (s(112)))) =► S a (S b (E b (S a (j(1121))))) =► S a (S b (E b (S a (i(1121))))) 
=► S a (S b (E b (S a (E a (s(1122)))))) =► S a (S b (E b (S a (E a (/(1122)))))) 

=► S a (S b (E b (S a (E a (i(112)))))) =► S a (S b (E b (S a (E a (/(ll)))))) 

=► S a (S b (E b (S a (E a (E a (s(12) )))))) => S a (S b (E b (S a (E a (E a (i(12) )))))) 

=* S a (S b (E b (S a (E a (E a (i(l))))))) =► S a (S b (E b (S a (E a (E a (Z) ))))). 

Fig. 2 visualizes the above computation. It shows the input XML tree t annotated with 
attributes and their dependencies. For example, f ( 1 1 2 1 ) is computed by E a (s(1122)) 
because of the attribute rule (z'(jtl) — » E x (j(ji2))) £ R Nx with n = 112 and x = a from 
label{$(t), 112) = N a . 

The XatT~ ,s transforms from XML trees into the corresponding XML stream. 
Using two attributes and i, we can make an evaluation in depth-first right-to-left order. 
Note that we do not directly use this Xatt t— for stream processing. Since we use an 
ATT obtained by the composition of the original Xatt / ~ ,s and parsing transformation, 



1 The initial attribute and the root symbol [t are omitted for simplicity. 
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Fig. 2. The computation by M id for the input N a (N b (L, N a (L, L)), L) 

the above evaluation does not mean that stream processing cannot start to output any 
result before it reads the complete input. 

Xatt*~ s can deal with symbols other than output symbols in Y, s tream in right-hand 
sides of the attribute rules. Let us consider an Xatt t—S ML has the same definition as 
M ic [ except that Er x is a set of the following rules: 

s(7t) — > Cat(S x ,i(7tl)), t'(Ttl) — » Cat(E x ,5'(7t2)), /(ji2) — » i(n). 

The computation by ML is done in similar to that by M id . For example, if outputs 

S a (S b (E b (. . .))) for an input, M' jd outputs Cat(S a , Cat(S b , Cat(E b , ...))) for the same 
input. The symbols S, : and E , are used as nullary symbols and the binary symbol Cat 
means a concatenation of outputs and In the rest of paper we use these symbols instead 
of unary symbols 5 X and E , for output symbols. It will be helpful for us to obtain XML 
stream processors. 

Furthermore, our framework allows the output alphabet of Xatt* >s to include the 
other symbols such as Str T . Eq r , And, Or and If: the nullary symbol Str, means a 
string value x\ the unary symbol Eq z . means a boolean value representing whether the 
argument matches a string value x or not; the binary symbols And and Or represent 
boolean operations in usual way; the 3-ary symbol If means a conditional branch by 
the first argument. These symbols are useful for us to define a wide variety of XML 
transformations. 

3.3 Conversion from Practical XML Transformations to Xatt , —S 

This section gives practical examples and several hints for converting XML transforma- 
tion into Xatt* - 5 . A basic idea of the conversion is that we give a number of pairs of 
synthesized attribute s and inherited attributes i whose attribute rules for N. /: have the 
following form: 

s(n) \f(etest,e s i,e s 2), i(nl) -> lf(e tert ,e,7,ea), i(n2) — *■ i(n) (1) 

where e test , e s i, e S 2 , en and ea are right-hand side expressions for Xatt f ~ s . By adjusting 
these expressions we can convert a variety of XML transformations into Xatt*^ s . For 
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example, let us consider an Xatt*^ s Mju which has the same definition as M\ d except 
that R Nx has attribute rules of the form (1) where 

e,est = Eq x (Str b ), e sl = s{n2), e s2 = Cat(S x ,s(7tl)), eu = e i2 = Cat(E x (.y(7t2))) 2 . 

The Xatt t— ' s Mm gives a transformation which outputs the input tree with no b 
element, which can be figured out as follows. If x ^ b, then R Nx of Mm equals to that of 
M' id because we take the second branch Cat(Sa;,s(jrl)) for If. Suppose that x = b. Since 
we take the first branch j(ji 2), the dotted arrow which points to 5 (H) in Fig. 2 comes 
from i(112) instead of s(lll). It implies that all nodes under the first branch of N b is 
skipped in the evaluation. Therefore we get the transformation result which is the input 
tree with no b element. In the following, we show how to convert XML transformations 
to Xatt*^ s for two practical examples, XPath querying and partial modification. Each 
transformation is defined by a pair of synthesized and inherited attributes. A number 
of XML transformation programs can be regarded as the combination of these basic 
transformations. 

XPath Querying and Its Extension. XPath [24] is a syntax for defining parts of an XML 
tree by using paths on the XML tree. For example, /child: : a /descendant : :b 
[ child : : c ] designates b elements each of which is descendant of an a node and has a 
c node as its child. XPath querying collects all elements satisfying the XPath expression. 

A part of XPath querying can be converted into Xatt / ~ s . We consider querying by 
absolute XPath expressions in which only forward axes are used, i.e. . the expression 
cannot contain backward axes such as parent and ancestor. Note that this restriction 
is not too excessive. All backward axes in absolute XPath can be removed by the method 
introduced by Olteanu et al. [21]. 

XPath querying is converted into Xatt f ~ s M que which contains conditional branches 
of the form (1 ) where 

e s] = Cat(S x ,5(7tl)), e S 2 = s(nl), eu = Cat(E x ,5(n2)), e,- 2 = 5 (ti2). 

In the computation by M que , the node is copied if e test is evaluated to true at the node. 
Otherwise, no copy is created for the node. Thus we need to specify how to give e test for 
each XPath expression in order to convert the XPath querying into Xatt*^ s3 . 

The conversion is defined by associating all subexpressions of a given XPath expres- 
sion. with a synthesized or inherited attribute in an XatF^ 5 . Subexpressions in brack- 
eted qualifiers (also called predicates) in the XPath are associated with synthesized 
attributes. The other subexpressions are related with inherited attributes. Consider an 

ul 

absolute XPath expression /child : : a/descendant : :b [child: : c]. We take 

V V ' v v ' 

u2 vl 

three subexpressions as shown by curly braces each of which is associated with an 



~ The conditional branch If is useless for i(7tl)-rule because eu = eu- 
2 ..... 
This conversion assumes that the node whose ancestor is queried is not queried by a given 

XPath expression. In order to query such nested nodes, the other conversion is required. 
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attribute assigned to the brace. The attributes ul and u2 are inherited and vl is synthe- 
sized. To complete the conversion, we use one more attribute u3 to propagate information 
about whether the node is a descendant of the node satisfying the XPath expression. The 
following attribute rules define the relation of the attributes: 

B} = {z/7 (ttl) — > True, u2( 7tl) — > False, k5(ti1) — > False, . . .} 

R Nx = {v7(jt) — > Or(Eq a ,(Str c ),v7(tt2)), m7(ti1) — > False, ul(n2) — » ul(n), 
u2(nl) — * Or(u2(n), And(n7(ix), Eq x (Str a ))), u2(n2) —>u2(n), 
u3( Ttl) — > Or(w3(jt), And(w2(7t), And(Eq x (Str b ),v7(7tl)))), 
u3(n2) — >, Or(Mj(tt), And(M2(7t), And(Eq x (Str b ),v7(7tl)))), ...} 

7? l = {v7(tt) — > False, . . .}. 

where several attribute rules are omitted. We use the same attribute rules as 7? L and 77* of 
M u j for attributes s m , s and i. The value of an attribute ul represents whether the node is 
the child of the root; The value of u2 represents whether the node is the descendant of an 
a node which is the child of the root; The value of vl represents whether either the node 
or one of the following sibling node is a c node. An XatT~' s representing the intended 
XPath querying is defined by M que with e test = Or(«i(7t), And(w2(7t), And(Eq a ,(Str b ), 
v7(7tl)))). The expression e tes t , which equals to the right-hand side of the attribute 
rules for u3, represents whether either the node or one of its ancestors satisfies the 
XPath. 

Partial Modification. Whereas XPath querying leaves the designate elements and strips 
their contexts, partial modification leaves the context of the the designate elements 
and replaces the elements with the other elements. The partial modification is con- 
verted into an Xatt* - * s in similar way to XPath querying. Let us consider an Xatt*~ s 
M mo d which has the same definition as ML except that R Nx is a set of the following 
rules: 

e s i = Cat(S x ,Cat(E I ,,j(7t2))), e s2 = Cat(S cc ,^(nl)), eu = e i2 = E x (s(7t2)). 

where e test is a certain expression for specifying the designate node. The element is 
replaced with a no-child element if e tes t is evaluated to true at the node. Otherwise, the 
node does not change since the attribute rules equals to that of ML. 

Let us see the other example of partial modification. Consider an Xatt f ~® M' mod 
which has the same definition as M mo< j except that the first two rules of R Nx is as 
follows: 



e s i = Cat(S a ,s(7tl)), e s2 = Cat(S a; ,s(7tl)), e y = Cat(E a ,s(jc2)), e i2 = Cat(E x ,s(7t2)). 

The name of the node changes into a if e tes t is evaluated to true at the node. Otherwise, 
the node does not changed. This procedure is applied to every node of the input XML. 
Now we show an example of Xatt*~ s Mjks that plays a role of partial modification. The 
attribute rules of Mjks are comprised of 
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R 8 = {SinW ->■ s(rtl), l'(jtl) -> Z} 

R N x = {v7(rc) -> Or(Eq a ,(Str k ), v7(tt2)), 

■y(it) -»■ Cat(lf(And(Eq a .(Str J ),v7(itl)),S s ,S a: ),s(jrl)), 
t(7tl) -» Cat(lf(And(Eq a .(Str J ),vi(7il)),E s ,E a; ),s(jc2)), 
i(n2) — + ?(tt)} 

R l = {s(7t) — > vi(jt) — > False} 

where we use Cat(lf(e to ,,ei,e 2 ),ei) instead of lf(e tof ,Cat(e/,e 2 ),Cat(e 2 ,ei)) to ob- 
tain efficient stream processors. The name of every node changes into S by M only if 
the node satisfies an XPath expression /descendant-or-self : :J[child: :k] 
(= / / J [ k] ), i.e. , M jks changes the name of the J node having a k node as its child into 
S. For example, an XML <h><Ixk/x/I>< jxk/x/ J>< J><c/></ Jx/h> is 
transformed into <hxIxk/x/IxSxk/x/SxJxc/x/ Jx/h> by Mj^g. 
It uses no inherited attribute associated with the subexpression /descendant-or- 
self : : since the expression should be satisfied at any node. 

Combination of Basic Transformations. The conversion of basic transformations is 
easily extended by the other XML transformations. For instance, consider an XML trans- 
formation T com b which returns the collection of results of partial modification only for 
nodes specified by an XPath. We need two pairs of synthesized and inherited attributes: 
one pair (si,i\) is used for a partial modification; another pair (s 2 ,h) is used for col- 
lecting results for each nodes specified by XPath. The XML transformation T com j, is 
defined by Xatt t— with these attributes. The conversion of combination of basic trans- 
formation into Xatt* - ^ has been automatically done in the implementation of an XML 
transformation language XTiSP introduced in Appendix B (see also [29]). 

4 Composition with an XML Parsing Attributed Tree Transducer 

This section introduces a method for obtaining an ATT which represents a transformation 
from XML streams to XML streams, denoted by XatL~ ,s . An Xatt* - in the previous 
section represents a transformation from XML trees to XML streams. In order to derive 
Xatt’ —S , we compose the Xatt*~ s with an XML-parsing ATT M parse to synthesize a 
single ATT, where M parse represents a transformation from XML streams to trees. The 
composition employs a composition method for stack-attributed tree transduce rs(S ATT) 
[16] because the parsing ATT requires a stack device which was harmful for the original 
composition method [8, 14]. Since the composition method is rather involved, we intro- 
duce the method specialized for the case of the composition an Xatt*^ s and the parsing 
ATT M parse whose definition is presented in Appendix A. By applying the composition 
method to a given Xatt*^ s and M parse , the following composition method is obtained. 
Although the composition method does not deal with annotations of nodes, we can easily 
make extension of the method for them. 

Let M = (Syn,Inh,Y,treei A, R) be an Xatt*^ s where A includes primitive 
functions such as Cat, S x , True, and so on. The corresponding XatL~ s is defined by 
M = (Syn 1 ,lnh' , T: S t re a m . A' R') where 
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- Syn' = {(s ,/ ,,s} | s' G Syn, s G {p,/}}, Inh! = {{i' ,s) \ i' G Inh , s G {/?, /}} 

- A' = AC {Head (1) ,Tail {1) ,Cons {2 \Nil (0) }, 

- and R' = R'$ u R' Sx U R' Ex U R' z with 

-R' # = { Sin ->■ <P['* , (” 1 ) ; = ,P){^)] S ’&Syn I (s/„(jt) ->■ (p) G i? # } 

u {(/',/?} -»■ cp[5 / (7il) := (/,p)(ttl)]. s / 6Sy „ I (i'(7tl) -» cp) G R s } 

U {{i' ,1) —> Nil | i r G Inh} 

R ,Sx = {{s',p){ n) -> p(<p) | (/(n) -> (p) G R Nx ,s' G Syn } 

U {(/,Z)(n) — > Tai/((s',l)(nl)) | s' G Syn} 

U {{i',p}{ ttl) — > p(cp) | (* / (tc 1) — > (p) G R Nx ,i' G Inh} 

U {(i',l)(n 1) — > Cons(p((p),(i , ,/)( 7t )) I 9) G -R N “’,i / G Inh} 

R' Ex = {{s',p}{ n) -> cp[i'(7r) := {i' ,p){n)] v&nh | ( s'(n ) -> cp) G R L ,s' G Syn} 

U {(s / ,Z)(tc) — > Cons((/,p)(7il),(/,/)(7tl)) | s' G Syn} 

U {(i , ,/?)(7tl) — > Head((i' ,l)(n)) \ i' G Inh} 

U {(Z , ,Z)( 7 tl) — > Tail((i' ,/)(jc)) | t 7 G /n/t} 

R ' Z = WiP)^) ->• #'(*:) := ■/£/„,, | (/(tt) — » cp) G i? L y G Syn} 

u {(/,z)(tt) — > m/ | s' g syn} 

where p= [/(7tl),/(Tt2 ), /'(ti) := (s' ,p)(nl),Head({s' ,p)(n)] s/eSyn / e i nh . 

The added output symbols Head, Tail, Cons and Nil are caused by a stack operator in 
M parse- Each of them has a meaning with respect to stack operation: Head(e) represents 
the top-most element of a stack e; Tail(e) represents a stack e whose top-most element 
is removed; Cons{e i,e 2 ) represents a stack obtained by pushing a value e\ to a stack 
t' 2 ; Nil represents an empty stack. 

For example, we obtain Xatt s ^ s M' Jks from the definition of the Xatt*^ 5 Mj^s by 
the above method. The attribute rules of M' Jks are comprised of 

f?' # = (s,„(Tt) -+ (s,p) (ttl) , (i,p)( 7tl) -> Z, <i,/)(ttl) >Nil}, 

R' Sx = {(s,p)(i t) Cat(lf(And(Eq a .(Str J ),(vi,p)(jtl)),S s ,S x ),(s,p)(7tl)), 

(s,l)(n)^>Tail{{s,l)(nl)), 

(i,p)(l tl) Cat(lf(And(Eq x (Str J ),{v7,p)(7tl)), E S ,E X ) ,Head({s,l) (nl))), 

(i,l)(nl) -► Cons((i,p)(n),{i,l)(n)), 

(vl,p)(n) -*• Or(Eq a .(Str k ),/Zead((vi,/)(7tl))), (v7,Z)(rc) -> 7hi7((vi,Z)(jtl))}, 

i?' E * = {(s,p)(jt) -> <i,p)(7t), (s,Z)(7t) -+ Cons((s,p)(jtl),(s,Z)(jtl)), 

(i,p)(7tl) — ► Head({i,l)(n)), ( i,l}(nl ) — > Tail({i,l)(n)), 

(vl,p)(n) -* False, {vl,l){n) -> Co«.s((v7,p)(jtl), (v7,/)(7il)}, 

Z?' Z = {(5,p)(tt) -> (i,p)(jt), (s,l)(n)->Nil, ( vl,p)(n ) -> False, (v 1,1) {it) -> Nil}. 
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5 Attribute Evaluation in Stream Processing Style 

We show a method deriving an XML stream processor from an Xatt s ~ s . The main idea 
of the method is that attribute values the Xatt s—s are evaluated in stream processing 
style. 

In the attribute evaluation for Xatt s ~ s , we use extra rules for primitive functions as 
well as the reduction rules specified by the definition of ATT. The definition of Xatt*~ s 
allows to use primitive functions such as If , And and Cat. At the previous stage, they have 
been regarded as constructor symbols since the composition method for ATTs cannot be 
applied to primitive functions destroying tree structures. The attribute evaluation deals 
with them as meaningful functions to obtain the transformation result. For instance, we 
have the following rules: 

lf(True,ei,e 2 ) => ei, lf(False,ei,e 2 ) => e 2 , Eq x (Str x ) => True, 

And(True,ei) => ei, And(False,ei) => False, 

It is obvious that these rules do not conflict with the reduction rules defined by ATT. 

We use a running example of the Xatt s ^ s M' Jks to show a method for deriving an XML 
stream processor. Suppose that an input for M' Jks is t = S J (S k (E k (S 1 (E 1 (E J (Z)))))) 
corresponding to an XML <Jxk/xl/x/J>. Let =>■ stand for => M j w ,t)(t) i n the rest 
of this section. From the definition by ATT, we obtain the transformation result r by 
r = «/(=>, Sj n ( e)). An XML stream processor for M' Jks computes the normal form by 
integrating a partial result for each input symbol, Sj, S k , E k , S- , E- L , Ej and Z. 

Before any input symbol is read, we find that?’ is computed as «/(=>, (s,/?)(l)). since 
we have s,- n (e) => (s,p)(l) from the attribute rule of R'^ in M' JkS . We cannot progress the 
computation until the first symbol is read. Additionally, in preparation for the following 
computation, we evaluate two attribute values (i,p) (1) and (i, l ) (1) which may be needed 
when the next symbol is read. These values are computed into Z and Nil, respectively, 
by the attribute rules of R' : . 

When an input symbol Sj is read, we find that r is computed as «/(=>, (s,p)(l)). 
since we have 

(s,p)( 1) => Cat(lf(And(Eq J (Str J ),(v7,p)(ll)),S s ,S J ),(s,/j)(ll)) 

=> Cat(lf (And (True, (vI,p)(ll)),S s ,Sj), (i,p)(ll)) 

=* Cat(lf((vi,/j)(ll),S s ,Sj), ( s,/j)(11)) 

from the attribute rule of R'^ in M' Jks , Eq x (Str x ) => True and And(True,ei) => e\. We 
cannot progress the computation until the next symbol is read for computing the value 
of (s,p) (11) and (vl ,p) (11) . Additionally, in preparation for the following computation, 
we evaluate two attribute values (/,/>) (11) and (i,/)(ll) which may be needed when the 
next symbol is read. These values are computed as follows: 

<i»(H) =^* Cat(lf((v7,p)(ll), E s , Ej), Head ((s,l)(ll))) 

(i,/)( 11) => Cons((i,p)( l),(t,/)(l)) = Cons(Z,Nil) 

where we use the values of (i,p)( 1) and (i,/)(l) that is prepared by the last step. The 
attribute value of (i,p)( 11) is just partially computed since it requires the following input 
symbol to know the values of (vl,p)( 11) and (s,/)(ll). 




86 



K. Nakano 



When the next input symbol S k is read, we find that r is computed as n/(=> , (s,p) (1) ) . 
since we have 

Cat(lf((v7,p)(ll),S s ,Sj), (s,p)(H)) 

=> Cat(lf (Or(Eq k (Str k ),//efl<7((v7,/)(lll))), S s , Sj), (s,p)(ll)) 

=> Cat(lf(Or(True,//efl(7((vi,/)(lll))),S s ,Sj), (s,/?)(ll)) 

=> Cat(lf(True,S s ,Sj),(s,p)(ll)) => Cat(S s , (s,p)(ll)) 

=> Cat(S s ,Cat(lf(And(Eq k (Str J ),(vi,p)(lll)),S s ,S k ),(s,/7)(lll))) 

=> Cat(S s , Cat(lf (And (False, (vi,p)(lll)),S s , S k ), (s , ,p)(lll))) 

=> Cat(S s , Cat(lf(False,S s , S k ), (i,p)(lll))) 

=> Cat(S s ,Cat(S k ,(s,p)(lll))). 

Note that two Cat applications in Cat(S s , Cat(S k , . . .)) will not be modified by the 
following computation. Therefore the XML stream processor can output the string S s 
and S k , corresponding <S><k>, and the rest of result is computed by (,v./f)(lll j. The 
output is a desirable behavior for XML stream processing. The transformation by Mj^ 
replaces every J element into S only when the node has a k element as its child. Thought 
the transformation cannot return any symbol even if an input symbol <J> is read, the 
symbol <S> and its children are output once an input symbol <k> is found at the child 
position. If no <k> element is found at the child position of a J element, then the J 
element are output without changing the element name. 

The XML stream processor computes the transformation result by repeating the 
similar procedure to above for the following input. Letting j^Inh be the number of 
inherited attributes of an Xatt s ~ ,s M, the stream processor computes the fixed number 
Ink + 1 of values for each input symbol: one is used for the transformation result to be 
output afterward, the others may be needed at the next computation as values of inherited 
attributes. 

6 Experimental Results 

We have implemented our framework as an XML transformation language XTiSP[29], 
in which we can use two kinds of primitive transformations: XPath iteration and partial 
modification. See Appendix B for summary of XTiSP. The implemented system takes 
an XTiSP program and returns a stream processing program written in Objective Caml 
[20]. The system itself is also written in Objective Caml. 

We compared a program in XTiSP, which is converted into Mj^s, with the corre- 
sponding program written in XSLT [25]. We used Xalan [26] and SAXON [23] as XSLT 
processors. The comparison was done for an input XML generated randomly such that 
each tag name is I, J or k. The experiments were conducted on a PC (PowerMacintosh 
G5/Dual 2GHz, 1GB memory). We measured execution time and memory usage for sev- 
eral inputs whose size are 1MB, 2MB, 4MB and 8MB. Fig. 3(a) and Fig. 3(b) show the 
comparison results. Our implementation is much faster and much more memory-efficient 
than the others. We also tried more complicated examples of XML transformation such 
as XML database into XHTML and then our implementation won definitely. 
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Figure 3(a): Excecution time Figure 3(b): Memory usage 



I XTiSP 



\///A SAXON |: : : : : : : : :l xalan 



Fig. 3. Benchmark results 



However we cannot always benefit from automatic derivation of stream processors. 
As an extreme example, Xatt f ~ s can define a mirror transformation which reverses the 
order of child elements at every node. In this case, the program cannot output anything 
except for the start tag of the root until the end tag of the last child of the root is read 
whose next tag is the end tag of the root, that is the last token event of the input. This 
kind of transformation is not appropriate to stream processing. Although we have no 
way to find whether a given XML transformation is appropriate or not, we can easily 
add a mechanism to measure growth of stacks required by stream processing. 



7 Conclusion 

We have shown an implementation scheme for XML transformation language intended 
for stream processing. If one wants to implement an XML transformation language, all he 
has to do is to give a conversion method into Xatt*~ “ for the language. The programmer 
can obtain an efficient stream processing program without writing stateful programs like 
SAX. 

Additionally, we have implemented an XML transformation language XTiSP, which 
has its encoding method into Xatt*^ s , and have compared with other XML tree trans- 
formation languages to confirm effectiveness of our system. XTiSP works much faster 
than the other implementations, because that it can output until the whole input is read. 

Recently the author is addressing automatic generation from programs written in 
XSLT into Xatt f ~X If we automatically obtain Xatt t— * s from XSLT programs, we can 
also obtain the corresponding XML stream processor for XSLT. 
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A Stack-Attributed Tree Transducers M parse 

Stack-attributed tree transducer(SATT) is an extension of attributed tree transducer(ATT). 
See [16] for detail. The major difference from ATT is that SATT can deal with a stack de- 
vice for attribute values. Roughly speaking, SATT can simulate an ATT with an arbitrary 
number of attributes. In SATT, attributes are divided into two kinds of attributes, stack 
attributes and output attributes. Only stack attributes have a stack value which can be 
operated by Head, Tail and Cons. An XML parsing transformation M parse is an example 
of SATT. An SATT M parse is defined by 

Mparse — ( Sytl . I fill . StSyn . St lllll , S2 stream i SI tree • Sin? tt: R} 

where Syn = {p}, Inh = 0, StSyn = {/}, Stlnh = 0 and R = U R Sx U R Ex U R z with 

R 11 = {s/«(7t) — >■ p(7rl)},f? Sa; = {p{n) -> N x (p(nl),Head{l(nl))),l(n) — *■ Tail(l(nl))} 
R Ex = {p( n) — > L, /(it) — » Cons(p(ni),l(nl))}, R z = {p( n) — » L, l(n) — >M/} 

B Summary of XTiSP 

We summarize our language XTiSP[29]. A program written in XTiSP specifies a trans- 
formation from an XML to an XML. A simplified syntax of XTiSP is defined by 

exp = f ( exp , ■ ■ ■ , exp ) | <exp> [ exp ] | exp ; exp \ xpath 
| if exp then exp else exp end if 
| invite xpath do exp done 
| visit xpath do exp done 

A symbol / represents a function which takes a fixed number of strings or boolean 
values and returns a string or a boolean value. An expression <el> [e2] is used for 
constructing an element whose tag name and children are given by evaluation of el and 
e2, respectively. An expression el ; e2 returns a concatenation of evaluation results for 
el and e2. A symbol xpath stands for an XPath expression which returns a collection 
of elements satisfying the XPath. An if clause represents a conditional branch in a 
usual way. invite and visit are used for XPath-based iteration mentioned below. 
An expression between do and done is called iteration body. 

The transformation by XTiSP is defined by changing the current node. Initially, the 
current node is the root node of an input XML. When an iteration body for invite/ 
vi s i t is evaluated, the current node is changed into a node specified by XPath assigned 
with the iteration. An invite iteration returns a collection of results returned by eval- 
uating the iteration body where the current node is changed into every node satisfying 
the XPath. An visit iteration returns a subtree of the input XML whose root is the 
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current node, provided that every node v satisfying the XPath is replaced with the result 
returned by the iteration body where the current body is changed into v. 

All XTiSP programs can be converted into Xatt*~ s . Roughly speaking, an XTiSP 
program is converted by associating each subexpression of the program, in particular 
invite/visit iteration, with a pair of synthesized and inherited attributes. 
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Abstract. In safety-critical and high-reliability systems, software development 
and maintenance are costly endeavors. The cost can be reduced if software errors 
can be identified through automatic tools such as program analyzers and compile- 
time software checkers. To this effect, this paper describes the architecture and 
implementation of a software tool that uses lightweight static analysis to detect 
discrepancies (i.e., software defects such as exception-raising code or hidden fail- 
ures) in large commercial telecom applications written in Erlang. Our tool, starting 
from virtual machine bytecode, discovers, tracks, and propagates type information 
which is often implicit in Erlang programs, and reports warnings when a variety 
of type errors and other software discrepancies are identified. Since the analysis 
currently starts from bytecode, it is completely automatic and does not rely on any 
user annotations. Moreover, it is effective in identifying software defects even in 
cases where source code is not available, and more specifically in legacy software 
which is often employed in high-reliability systems in operation, such as telecom 
switches. We have applied our tool to a handful of real-world applications, each 
consisting of several hundred thousand lines of code, and describe our experiences 
and the effectiveness of our techniques. 

Keywords: Compile-time program checking, software development, software 
tools, defect detection, software quality assurance. 



1 Introduction 

All is fair in love and war, even trying to add a static type system in a dynamically typed 
programming language. Software development usually starts with love and passion for 
the process and its outcome, then passes through a long period of caring for (money 
making) software applications by simply trying to maintain them, but in the end it often 
becomes a war, the war against software bugs, that brings sorrow and pain to developers. 
In this war, the software defects will use all means available to them to remain in their 
favorite program. Fortunately, their primary weapon is concealment, and once identified, 
they are often relatively easy to kill. 

In the context of statically typed programming languages, the type system aids the 
developer in the war against software bugs by automatically identifying type errors at 
compile time. Unfortunately, the price to pay for this victory is the compiler rejecting 
all programs that cannot be proved type-correct by the currently employed type system. 
This starts another war, the war against the type system, which admittedly is a milder 
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one. The only way for programmers to fight back in this war is to rewrite their programs. 
(Although occasionally the programming language developers help the programmers in 
fighting this war by designing a bigger weapon, i.e., a more refined type system). 

Dynamically typed programming languages avoid getting into this second war. In- 
stead, they adopt a more or less “anything goes” attitude by accepting all programs, and 
relying on type tests during runtime to prevent defects from fighting back in a fatal way. 
Sometimes these languages employ a less effective weapon than a static type system, 
namely a soft type system, which provides a limited form of type checking. To be ef- 
fective, soft type systems often need guidance by manual annotations in the code. Soft 
typing will not reject any program, but will instead just inform the user that the program 
could not be proved type-correct. In the context of the dynamically typed programming 
language Erlang, attempts have been made to develop such soft type systems, but so 
far none of them has gained much acceptance in the community. We believe the main 
reasons for this is the developers’ reluctance to invest time (and money) in altering their 
already existing code and their habits (or personal preferences). We remark that this is 
not atypical: just think of other programming language communities like e.g., that of C. 

Instead of devising a full-scale type checker that would need extensive code alter- 
ations in the form of type annotations to be effective, we pragmatically try to adapt 
our weapon’s design to the programming style currently adhered to by Erlang pro- 
grammers. We have developed a lightweight type-based static analysis for finding dis- 
crepancies (i.e., software defects such as exception-raising code, hidden failures, or 
redundancies such as unreachable code) in programs without having to alter their source 
in any way. The analysis does not even need access to the source, since its starting point 
is virtual machine bytecode. However, the tool has been developed to be extensible in an 
incremental way (i.e., with the ability to take source code into account and benefit from 
various kinds of user annotations), once it has gained acceptance in its current form. 

The actual tool, called Dialyzer, 1 allows its user to find discrepancies in Erlang 
applications, based on information both from single modules and from an application- 
global level. It has so far been applied to programs consisting of several thousand lines 
of code from real-world telecom applications, and has been surprisingly effective in 
locating discrepancies in heavily used, well-tested code. 

After briefly introducing the context of our work in the next section, the main part of 
the paper consists of a section which explains the rationale and main methods employed in 
the analysis (Sect. 3), followed by Sect. 4 which describes the architecture, effectiveness, 
and current and future status of Dialyzer. Section 5 reviews related work and finally 
this paper finishes in Sect. 6 with some concluding remarks. 



2 The Context of Our Work 

The Erlang Language and Erlang/OTP. Erlang [1] is a strict, dynamically typed 
functional programming language with support for concurrency, communication, dis- 

1 Dialyzer: Discrepancy AnaLYzer of ERlang programs. (From the Greek SiaXvui: to dis- 
solve, to break up something into its component parts.) System is freely available from 
www. it .uu. se/research/group/hipe/dialyzer/. 
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tribution and fault-tolerance. The language relies on automatic memory management. 
Erlang’s primary design goal was to ease the programming of soft real-time control 
systems commonly developed by the telecommunications (telecom) industry. 

Erlang’s basic data types are atoms, numbers (floats and arbitrary precision inte- 
gers), and process identifiers; compound data types are lists and tuples. A notation for 
objects ( records in the Erlang lingo) is supported, but the underlying implementation 
of records is the same as tuples. To allow efficient implementation of telecommunication 
protocols. Erlang nowadays also includes a binary data type (a vector of byte-sized 
data) and a notation to perform pattern matching on binaries. There are no destructive 
assignments of variables or mutable data structures. Functions are defined as ordered sets 
of guarded clauses, and clause selection is done by pattern matching. In Erlang, clause 
guards either succeed or silently fail, even if these guards are calls to builtins which 
would otherwise raise an exception if used in a non-guard context. Although there is a 
good reason for this behavior, this is a language “feature” which often makes clauses 
unreachable in a way that goes unnoticed by the programmer. Erlang also provides a 
catch/throw-style exception mechanism, which is often used to protect applications 
from possible runtime exceptions. Alternatively, concurrent programs can employ so 
called supervisors which are processes that monitor other processes and are responsible 
for taking some appropriate clean-up action after a software failure, 

Erlang/OTP is the standard implementation of the language. It combines Erlang 
with the Open Telecom Platform (OTP) middleware. The resulting product, Erlang/OTP, 
is a library with standard components for telecommunications applications (an ASN. 1 
compiler, the Mnesia distributed database, servers, state machines, process monitors, 
tools for load balancing, etc.), standard interfaces such as CORBA and XML, and a 
variety of communication protocols (e.g., HTTP, FTP, SMTP, etc.). 2 

Erlang Applications and Real-World Uses. The number of areas where Erlang is 
actively used is increasing. However, its primary application area is still in large-scale 
embedded control systems developed by the telecom industry. The Erlang/OTP system 
has so far been used quite successfully both by Ericsson and by other companies around 
the world (e.g., T-Mobile, Nortel Networks, etc.) to develop software for large (several 
hundred thousand lines of code) commercial applications. These telecom products range 
from high-availability ATM servers, ADSL delivery systems, next-generation call cen- 
ters, Internet servers, and other such networking equipment. Their software has often 
been developed by large programming teams and is nowadays deployed in systems which 
are currently in operation. Since these systems are expected to be robust and of high avail- 
ability, a significant part of the development effort has been spent in their (automated) 
testing. On the other hand, more often than not, teams which are currently responsible 
for a particular product do not consist of the original program developers. This and the 
fact that the code size is large often make bug-hunting and software maintenance quite 
costly endeavors. Tools that aid this process are of course welcome. 

Our Involvement in Erlang and History of This Work. We are members of the HiPE 
(High Performance Erlang) group and over the last years have been developing the 



2 Additional information about Erlang and Erlang/OTP can be found at www . erlang . org. 
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HiPE native code compiler [10, 16], The compiler is fully integrated in the open source 
Erlang/OTP system, and translates, in either a just-in-time (JIT) or ahead-of-time fashion, 
BEAM virtual machine bytecode to native machine code (currently UltraSPARC, x86, 
and AMD64). The system also extends the Erlang/OTP runtime system to support mixing 
interpreted and native code execution, at the granularity of individual functions. 

One of the means for generating fast native code for a dynamically typed language is 
to statically eliminate as much as possible the (often unnecessary) overhead that type tests 
impose on runtime execution. During the last year or so, we have been experimenting 
with type inference and an aggressive type propagator, mainly for compiler optimization 
purposes. In our engagement on this task, we noticed that every now and then the compiler 
choked on pieces of Erlang code that were obviously bogus (but for which the rather 
naive bytecode compiler happily generated code). Since in the context of a JIT it does 
not really make much sense to stop compilation and complain to the user, and since it is 
a requirement of HiPE to preserve the observable behavior of the bytecode compiler, we 
decided to create a separate tool, the Dialyzer, that would statically analyze Erlang 
(byte)code and report defects to its users. We report on the methods we use and the 
implementation of the tool below. However, we stress that the Dialyzer is not just a 
type checker or an aggressive type propagator. 

3 Detecting Discrepancies Through Lightweight Static Analysis 

3.1 Desiderata 

Before we describe the techniques used in Dialyzer, we enumerate the goals and 
requirements we set for its implementation before we embarked on it: 

1. The methods used in Dialyzer should be sound : they should aim to maximize the 
number of reported discrepancies, but should not generate any false positives. 

2. The tool should request minimal, preferably no, effort or guidance from its user. In 
particular, the user should not be required to do changes to existing code like pro- 
viding type information, specifying pre- or post-conditions in functions, or having 
to write other such annotations. Instead the tool should be completely automated 
and able to analyze legacy Erlang code that (quite often) no current developer is 
familiar with or willing to become so. On the other hand, if the user chooses to 
provide more information, the tool should be able to take it into consideration and 
improve the precision of the results of its analysis. 

3. The tool should be able to do something reasonable even in cases where source code 
is not available, as e.g., could be the case in telecom switches under operation. 

4. The analysis should be fast so that Dialyzer has a chance to become an integrated 
component of Erlang development. 

All these requirements were pragmatically motivated. The applications we had in 
mind as possible initial users of our tool are large-scale software systems which typically 
have been developed over a long period and have been tested extensively. This often 
creates the illusion that they are (almost) bug-free. If the tool reported to their maintainers 
1,000 possible discrepancies the first time they use it, of which most are false alarms, 
quite possibly it would not be taken seriously and its use would be considered a waste 
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of time and effort. 3 In short, what we were after for Dialyzer version 1.0 was to create 
a lightweight static analysis tool capable of locating discrepancies that are errors: i.e., 
software defects that are easy to inspect and are easily fixed by an appropriate correcting 
action. 4 We could relax these requirements only once the tool gained the developers’ 
approval; more on this in Sect. 4.4. 

Note that the 2nd requirement is quite strong. It should really be obvious, but it also 
implies that there are no changes to the underlying philosophy of the language: Erlang 
is dynamically typed and there is nothing in our method that changes that. 5 

3.2 Local Analysis 

To satisfy the requirement that the analysis is fast, the core of the method is an intra- 
procedural, forward dataflow analysis to determine the set of possible values of live 
variables at each program point using a disjoint union of prime types. The underlying 
type system itself is based on an extension of the Hindley-Milner static type discipline that 
incorporates recursive types and accommodates a limited form of union types without 
compromising its practical efficiency. In this respect, our type system is similar to that 
proposed by Wright and Cartwright for Soft Scheme [18]. 

The internal language of the analysis to which bytecode is translated, called Icode, 
is an idealized Erlang assembly language with unlimited number of temporaries and 
an implicit stack. To allow for efficient dataflow analyses and to speed up the fixpoint 
computation which is required when loops are present, Icode is represented as a control- 
flow graph (CFG) which has been converted into static single assignment (SS A) form [3], 
In Icode, most computations are expressed as function calls and all temporaries survive 
these. The function calls are divided into calls to primitive operations (primops), built- 
in functions (bifs), and user-defined functions. Furthermore, there are assignments and 
control flow operations, including switches, type tests, and comparisons. The remainder 
of this section describes the local analysis; in Sect. 3.3 we extend it by describing the 
handling of user-defined functions and by making it inter-modular. 

Although Erlang is a dynamically typed language, type information is present both 
explicitly and implicitly at the level of Icode. The explicit such information is in the form 
of type tests which can be translations of explicit type guards in the Erlang source code, 
or tests which have been introduced by the compiler to guard unsafe primitive operations. 
The implicit type information is hidden in calls to primops such as in e.g. addition, which 
demands that both its operands are numbers. Note that non-trivial types for arguments 
and return values for all primops and bifs can be known a priori by the analyzer. These 



3 This was not just a hunch; we had observed this attitude in the past. Apparently, we are not the 
only ones with such experiences and this attitude is not ERLANG-specific; see e.g. [5, Sect. 6], 

4 Despite the conservatism of the approach, we occasionally had hard time convincing develop- 
ers that some of the discrepancies identified by the tool were indeed code that needed some 
correcting action. One reaction we got was essentially of the form: “My program cannot have 
bugs. It has been used like that for years!” . Fortunately, the vast majority of our users were 
more open-minded. 

5 This sentence should not be interpreted as a religious statement showing our conviction on 
issues of programming language design; instead it simply re-enforces that we chose to follow 
a very pragmatic, down-to-earth approach. 
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(a) Erlang code (b) Icode w/o optimization (c) Icode w optimization 



Fig. 1 . Erlang code with a redundant type guard 



types can be propagated forward in the CFG to jump-start the discrepancy analysis. For 
example, if a call to addition succeeds, we know for sure that the return value must be 
a number. We also know that, from that point forward in the CFG the arguments must 
be numbers as well, or else the operation would have failed. Similarly, if an addition is 
reached and one of its arguments has a type which the analysis has already determined 
is not a number, then this is a program point where a discrepancy occurs. 

More specifically, the places where the analysis changes its knowledge about the 
types of variables are: 

1. At the definition point of each variable 6 . At such a point, the assigned type depends 
on the operation on the right-hand side. If the return type of the operation is unknown, 
or if the operation statically can be determined to fail, the variable gets assigned the 
type any (the lattice’s top) or undefined (its bottom), respectively. 

2. At splits in the CFG, such as in nodes containing type tests and comparisons. The 
type propagated in the success branch is the infimum (the greatest lower bound in the 
lattice) of the incoming type and the type tested for. In the fail branch, the success 
type is subtracted from the incoming set of types. 

3. At a point where a variable is used as an argument in a call to a primop or a bif with 
a known signature. The propagated type is the infimum of the incoming type and the 
demanded argument type for the call. If the call is used in a guard context, then this 
is a split in the CFG and the handling will be as in case 2 above. 

When paths join in the CFG, the type information from all incoming edges is unioned, 
making the analysis path-insensitive. Moreover, when a path out of a basic block cannot 
be taken, the dead path is removed to simplify the control flow. In Fig. 1 the is_list/l 
guard in the first clause of the case statement can be removed since the pattern matching 
compiler has already determined that X is bound to a (possibly non-proper) list. This 
removal identifies a possible discrepancy in the code. 

The analysis, through such local type propagation aided by liveness analysis and 
by applying aggressive global sparse conditional constant propagation and dead code 
elimination [13], tries to reason about the intent of the programmer. If the most likely 
path out of a node with a type test is removed, or if a guard always fails, this is re- 



6 Note that since Icode is on SSA form there can be only one definition point for each variable. 
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Fig. 2. An Erlang program with two discrepancies due to a misuse of the bif size/1 

ported to the user as a discrepancy. Other discrepancies that are identified by local static 
analysis include function calls that always fail, pattern matching operations that will 
raise a runtime exception, and dead clauses in switches (perhaps due to an earlier more 
general clause). For example, on the program of Fig. 2(a), given that the signature of the 
erlang : size/ 1 built-in function is 

size (tuple ! binary) -> integer 

the analysis first annotates the Icode control-flow graph with type information. This 
can be seen in Fig. 2(b) which shows the result of propagating types for variable vO 
only. Given such a type-annotated CFG, it is quite easy to discover and report to the 
user that both the first case clause and the catch-all clause are dead, thereby removing 
these clauses; see Fig. 2(c). In our example, finding code which is redundant, especially 
the first clause, reveals a subtle programming error as the corresponding ‘measuring’ 
function for lists in Erlang is length/1, not size/1. 

3.3 Making the Analysis Infra- and Inter-modular 

Currently, the only way to provide the compiler with information about the arguments to 
a function is by using non-variable terms and guards in clause heads. (This information 
is primarily used by pattern matching to choose between the function clauses.) Since 
Dialyzer employs a forward analysis, when analyzing only one function, there can 
be no information at the function’s entry point, but at the end of the analysis there 
is information about the type of the function’s return value. By unioning all proper 
(i.e., non-exception) type values at the exit points of a function, we get additional type 
information that can then be used at the function’s call sites. The information is often 
non-trivial since most functions are designed to return values of a certain type and do 
not explicitly fail (i.e., raise an exception). To take advantage of this, the local analysis is 
extended with a persistent lookup table, mapping function names to information about 
their return values. The table is used both for intra-modular calls and for calls across 
module boundaries, but since the table only contains information about functions which 
have already been analyzed, some kind of iterative analysis is needed. 
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First consider intra-modular calls. One approach is to iteratively analyze all functions 
in a module until the information about their return values remains unchanged (i.e., until 
a fixpoint is reached). The possible problem with this approach is that the fixpoint 
computation can be quite expensive. Another approach, which is currently the default, 
is to construct a static call graph of the functions in a module, and then perform one 
iteration of the analysis by considering the strongly connected components of this graph 
in a bottom-up fashion (i.e., based on a reversed topological sort). If all components 
consist of only one function, this will find the same information as an iterative analysis. 
If there are components which consist of mutually recursive functions, we can either 
employ fixpoint computation or heuristically compute a safe approximation of the return 
value types in one pass (for example, the type any). Note that this heuristic is acceptable 
in our context; the discrepancy analysis remains sound but is not complete (i.e., it is not 
guaranteed to find all discrepancies). 

Now consider function calls across module boundaries. In principle, the call graph 
describing the dependencies between modules can be constructed a priori , but this 
imposes an I/O-bound start-up overhead which we would rather avoid. Instead, we 
construct this graph as the modules are analyzed for the first time, and use this information 
only if the user requests a complete analysis which requires a fixpoint computation. 

A Final Note: So far, the analysis has been applied to code of projects which are quite 
mature. However, as mentioned, our intention is that the tool becomes an integrated 
component of the program development cycle. In such situations, the code of a module 
changes often, so the information in the lookup table may become obsolete. When 
a project is in a phase of rapid prototyping, it might be convenient to get reports of 
discrepancies discovered based on the code of a single module. The solution to this is to 
analyze one module till fixpoint, using a lookup table that contains function information 
from only the start of the analysis of that module. The tool supports such a mode. 



4 Dialyzer 

4.1 Architecture and Implementation 

Figure 3 shows the Dialyzer in action, analyzing the application i nets from the standard 
library of the Erlang/OTP R9C-0 distribution. 

In Dialyzer v 1 .0, the user can choose between different modes of operation. The 
granularity option controls whether the analysis is performed on a single module or 
on all modules of an application. The iteration option selects between performing the 
analysis till fixpoint or doing a quick-and-dirty, one-pass analysis. The meaning of this 
option partly depends on the selected granularity. For example, if the granularity is per 
application and the one-pass analysis is selected, each module is only analyzed once, but 
fixpoint iteration is still applied inside the module. Finally, the lookup table re-init option 
specifies when the persistent lookup table is to be re-initialized, i.e., if the information is 
allowed to leak between the analysis elements specified by the granularity. Combinations 
of options whose semantics is unclear are automatically disabled. 

While the analysis is running, a log displays its progress, and the discrepancies 
which are found are reported by descriptive warnings in a separate window area; see 
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Fig. 3. The Dialyzer in action 



Fig. 3. When the analysis is finished, the log and the warnings can be saved to files. 
As described in Sect. 3.3, the module-dependency graph is calculated during the first 
iteration of the analysis of an entire application. If a fixpoint analysis on the application 
level is requested, Dialyzer uses this information to determine the order in which the 
modules are analyzed in the next iteration to reduce the number of iterations needed to 
reach a fixpoint. In fact, even in the one-pass mode the module-dependency graph is 
constructed, just in case the user decides to request a fixpoint analysis on completion. 
Requesting this is typically not burdensome as the analysis is quite fast; this can also be 
seen in the figure. On a 2GHz laptop running Linux, the Dialyzer analyzes roughly 800 
lines of Erlang code per second, including I/O. (For example, the sizes of mod cgi, 
mod_disk_log, and mod.htaccess modules are 792, 405, and 1 137 lines, respectively. 
As another example, one run of the analysis for the complete Erlang/OTP standard 
library, comprising of about 600,000 lines of code, takes around 13 minutes.) 

The Dialyzer distribution includes the code for the graphical user interface and 
the analyzer, both written in Erlang. As its analyzer depends on having access to a 
specific version of the HiPE native code compiler, on whose infrastructure (the BEAM 
bytecode disassembler, the translator from BEAM to Icode, and the Icode supporting 
code such as SSA conversion and liveness analysis) it relies, the presence of a recent 
Erlang/OTP release is also required. Upon first start-up, Dialyzer will automatically 
trigger the fixpoint-based analysis of the Erlang/OTP standard library, stdlib, to construct 
a persistent lookup table which can be used as a basis for all subsequent analyses. 

4.2 The Dialyzer in Anger 

In order to show that our analysis is indeed effective in identifying software defects, we 
present some results obtained from using the Dialyzer to analyze code from large-scale 
telecom applications written in Erlang. These applications all have in common that they 
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are heavily used and well-tested commercial products, but as we will see, Dialyzer still 
exposed problems that had gone unnoticed by testing. Some brief additional information 
about these applications appears below: 

- AXD301 is an asynchronous transfer mode (ATM) switching system from Erics- 
son [2]. The project has been running for more than eight years now and its team 
currently involves 200 people (but this number also includes some support staff; 
not only developers or testers). The ATM switch is designed for non-stop operation, 
so robustness and high availability are very important and taken seriously during 
development. As a consequence, a significant effort (and part of the project’s budget) 
has been spent on testing its safety-critical components; see also [17]. 

- GPRS (General Packet Radio Service) is a telecom system from Ericsson. A large 
percentage of its code base is written in Erlang. The project has been running for 
more than seven years now and its testing includes extensive test suites, automated 
daily builds, and code coverage analysis. Since this was a pilot-study for the ap- 
plicability and effectiveness of Dialyzer in identifying discrepancies, only part of 
GPRS’s Erlang code has so far been analyzed. Although only part of the total code 
base, the analyzed code is rather big: it consists of 580,000 lines of Erlang code, 
excluding comments. 

- Melody is a control system for a “Caller Tunes” ringbacktone service developed 
by T-Mobile. It is an implementation of a customer database with interfaces to 
media players, short message service centers, payment platforms, and provisioning 
systems. The core of Melody is significantly smaller than the other telecom products 
which were analyzed; however, it includes parts of T-Mobile’s extensively used and 
well-tested standard telecom library. 

In addition to these commercial applications of Erlang, we also analyzed the com- 
plete set of standard libraries from Erlang/OTP release R9C-0 from Ericsson and code 
from Jungerl, 7 which is an open-source code repository for Erlang developers. 

In order to have a more refined view of the kinds of discrepancies Dialyzer found, 
we can manually divide them into the following categories: 

Explosives. These are places in the code that would raise a run-time exception. Exam- 
ples of this are calls to Erlang built-in functions with the wrong type of arguments, 
operators not defined on certain operands, faulty (byte) code, etc. An explosive can 
of course be conditional (e.g., firing on some execution paths only, rather than in all 
paths). 

Camouflages. These are programming errors that for example make clauses or branches 
in the control flow graph unreachable — although the programmer did not intend 
them as such — without causing the program to stop or a supervisor process being 
notified that something is wrong. The most common error of this kind is a guard 
that will always silently fail. 

Cemeteries. These are places of dead code. Such code is of course harmless, but code 
that can never be executed often reveals subtle programming errors. A common kind 



7 A Jungle of Erlang code; see sourceforge.net/projects/jungerl/. 
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Table 1 . Number of discrepancies of different kinds found in the analyzed projects 



Project 


Lines of code 
(total) 


Discrepancies 

(total) 


Classification 

Explosives Camouflages Cemeteries 


OTP R9C-0 


600, 000 


57 


38 (31) 


5 14 


AXD301 


1,100,000 


132 


26 


2 104 


GPRS 


580, 000 


44 


10 


2 32 


Jungerl 


80, 000 


12 


5 


2 5 


Melody 


25,000 


9 


8(7) 


1 0 



of cemeteries are clauses in case statements which can never match (because of 
previous code) and are thus redundant. 

For example, in the code of Fig. 2(a) if the analysis encounters, in this or in some 
other module, a call of the form test ( [_ I _] ) , this is classified as an explosive since it 
will generate a runtime exception. In the same figure, both the first and the last clause 
of the case statement are cemeteries, as they contain dead code. On the other hand, 
the code fragment below shows an example of a camouflage: the silent failure of the 
size(X) call in a guard context will prevent this clause from ever returning, although 
arguably the programmer’s intention was to handle big lists. 

test(X) when is_list(X), size(X) > 10 -> 

{list, big_size}; 

... ’// Other clauses 

Table 1 shows the number of discrepancies found in the different projects 8 . The 
numbers in the column titled “lines of code” show an indication of the size of each 
project (comments and blank lines have been excluded) and justify our reasoning why 
requiring type information or any other user annotations a posteriori in the development 
cycle is not an option in our context. Although we would actually strongly prefer to 
have any sort of information that would make the analysis more effective, we are fully 
convinced that it would be an enormous task for developers to go through all this code 
and provide type information — especially since this would entail intimate knowledge 
about code that might have been written by someone else years ago. Realistically, the 
probability of this happening simply in order to start using Dialyzer in some commercial 
project, is most certainly zero. 

Despite these constraints, Dialyzer is quite effective in identifying software defects 
in the analyzed projects; see Table 1 . Indeed, we were positively surprised by the amount 
of discrepancies Dialyzer managed to identify, given the amount of testing effort al- 
ready spent on the safety-critical components of these projects and the conservatism of 
the methods which Dialyzer version 1 .0 currently employs. 

In addition to finding programming errors in Erlang code, Dialyzer can also 
expose software errors which were caused by a rather flawed translation of record ex- 
pressions by the BEAM bytecode compiler. In Table 1, 31 of the reported explosives 



s Actually, Dialyzer also warns its user about the use of some archaic Erlang idioms and code 
relics; these warnings are not considered discrepancies and are not reported in Table 1. 
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for Erlang/OTP R9C-0 and 7 for Melody (indicated in parentheses) are caused by the 
BEAM compiler generating unsafe instructions that fail to be guarded by an appropriate 
type test. This in turn could result in buffer overruns or segmentation faults if the in- 
structions’ arguments were not of the (implicitly) expected type. This compiler bug has 
been corrected in release R9C- 1 of Erlang/OTP. 

4.3 Current Features and Limitations 

The tool confuses programming errors with errors in the BEAM bytecode. Typically 
this is not a problem as Dialyzer has built-in knowledge about common discrepancies 
caused by flawed BEAM code. When such a discrepancy is encountered, Dialyzer 
recommends its user to re-generate the bytecode file using a newer BEAM compiler and 
re-run the analysis. As a matter of fact, we see this ability to identify faulty BEAM code 
as an advantage rather than as a limitation. 

Starting from bytecode unfortunately means that warning messages cannot be de- 
scriptive enough: in particular they do not precisely identify the clause/line where the 
discrepancy occurs; see also Fig. 3. This can often be confusing. Also, since soundness 
currently is a major concern, the Dialyzer only reports warnings when it is clear that 
these are discrepancies. For example, if a switch contains a clause with a pattern that 
cannot possibly match then this is reported since it is a clear discrepancy. On the other 
hand, if the analysis finds that the patterns in the cases of the switch fail to cover all 
possible type values of the incoming term, this is not reported since it might be due to 
over-approximation caused by the path-insensitivity of the analysis. Of course, we could 
easily relax this and let the programmer decide, but as explained in Sect. 3.1 soundness 
is a requirement which Dialyzer religiously follows at this point. 

4.4 Planned Future Extensions 

One of the strengths of Dialyzer version 1 .0 is that no alterations to the source code 
are needed. In fact, as we have pointed out, the tool does not even need access to it. 
However, if the source code is indeed available, it can provide the analysis with additional 
information. Work is in progress to generate Icode directly from Core Erlang, which 
is the official core language for Erlang and the language used internally in the BEAM 
compiler. Since Core Erlang is on a level which is closer to the original source, where 
it is easier to reason about the programmer’s intentions, it can provide Dialyzer with 
means to produce better warning messages; in particular line number information can 
be retained at this level. The structure of Core Erlang can also help in deriving, in a 
more precise way, information about the possible values used as arguments to functions 
that are local to a module. 

We also plan to extend Dialyzer with the possibility that its user incrementally adds 
optional type annotations to the source code. The way to do this is not yet decided, but the 
primary goal of these annotations, besides adding valuable source code documentation, 
is to aid the analysis in its hunt for discrepancies, not to make Erlang a statically typed 
language. If a type signature is provided for a function, and this signature can be verified 
by Dialyzer as described below, it can be used by the analysis in the same way as 
calls to bifs and primops are used in the current version. The way to verify a signature 
is as follows: instead of trying to infer the types at each call site (as would be the case 
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in most type systems), the signature would be trusted until the function is analyzed. At 
this point the signature would be compared to the result of the analysis and checked for 
possible violations. Since Dialyzer is not a compiler, no programs would be rejected, 
but if violations of user-defined signatures are discovered, this would be reported to the 
user together with a message saying that the results of the discrepancy analysis could 
not be trusted. 

Taking this idea further, we also plan to experiment with relaxing soundness by 
allowing the user to specify annotations that in general cannot be statically verified (for 
example, that a certain argument is a non-negative integer). This is similar to the direction 
that research for identifying defects such as buffer overruns and memory leaks in C (see 
e.g. [6, 4]) or for detecting violations of specifications in Java programs [8] has recently 
taken. 



5 Related Work 

Clearly, we are not the first to notice that compiler and static analysis technology can be 
employed for identifying defects in large software projects. 9 Especially during the few 
last years, researchers in the programming language community have shown significant 
interest in this subject; see e.g. the work mentioned in the last paragraph of the previous 
section and the references therein. Most of that work has focused on detecting errors such 
as buffer overruns, memory access errors such as accessing memory which has already 
been freed or following invalid pointer references in C, race detection in multi-threaded 
Java programs, etc. These software defects are simply not present in our context, at least 
not directly so. 10 Similarly to what we do, some of these analyses do not need source 
code to be present, since they start from the binary code of the executable. On the other 
hand, we are not aware of any work that tries to detect flaws at the level of virtual machine 
bytecode caused by its flawed generation. 

During the late 80’s and the beginning of the 90’s, the subject of automatic type 
inference without type declarations received a lot of attention; see e.g. [12] for an early 
work on the subject. A number of soft type systems have been developed, most of them 
for the functional languages Lisp and Scheme, and some for Prolog. The one closest 
to our work is that of Soft Scheme [18]. Perhaps sadly, only a few of them made it 
into actual distributions of compilers or integrated development environments for these 
languages. Some notable exceptions are DrScheme [7], a programming environment for 
Scheme which uses a form of set-based analysis to perform type inference and to mark 
potential errors, and the NUDE (the NU-Prolog Debugging Environment [14]) and Ciao 
Prolog [9] systems which also incorporate type-annotation-guided static debuggers. 

In the context of Erlang, two type systems have been developed before: one based 
on subtyping [11] and a recent one based on soft types [ 1 5] . To the best of our knowledge, 
the latter has not yet been used by anybody other than its author, although time might 
of course change this. The former ([11]) allows for declaration-free recursive types 
using subtyping constraints, and algorithms for type inference and checking are also 



9 We are also willing to bet our fortunes that we will not be the last ones to do so either! 

10 They can only occur in the VM interpreter which is written in C, not in Erlang code. 
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given in the same paper. It is fair to say that the approach has thus far not been very 
successful in the Erlang community. Reasons for this include the fact that the type 
system constraints the language by rejecting code that does not explicitly handle cases 
for failures, that its inference algorithm fails to infer types of functions depending on 
certain pattern matching constructs, and that it demands a non-trivial amount of user 
intervention (in the form of type annotations in the source code). Stated differently, 
what [11] tries to do is to impose a style of programming in Erlang which is closer 
to that followed in statically typed languages, in order to get the benefits of static type- 
error detection. Clearly this goal is ambitious and perhaps worthwhile to pursue, but 
then again its impact on projects which already consist of over a million lines of code is 
uncertain. Our work on the other hand is less ambitious and more pragmatically oriented. 
We simply aim to locate (some of the) software defects in already developed Erlang 
code, without imposing a new method for writing programs, but by trying to encourage 
an implicit philosophy for software development (namely, the frequent use of a static 
checker tool rather than just relying on testing) which arguably is better than the practice 
the (vast majority of the) Erlang community currently follows. 

6 Concluding Remarks 

Dialyzer version 1.0 represents a first attempt to create a tool that uses lightweight 
static analysis to detect software defects in large telecom applications and other pro- 
grams developed using Erlang. While we believe that our experiment has been largely 
successful, there are several aspects of the tool that could be improved through either 
better technology or by relaxing its requirements (e.g., no false warnings), which are 
currently quite stringent. Given support, we intend to work in these directions. 

On a more philosophical level, it is admittedly the case that most of the software 
defects identified by Dialyzer are not very deep. Moreover, this seems to be an inherent 
limitation of the method. For example, problems such as deadlock freedom of Erlang 
programs cannot be checked by Dialyzer. One cannot help being a bit skeptical about 
the real power of static analysis or type systems in general, and wonder whether a tool 
that used techniques from software model checking would, at least in principle, be able 
to check for a richer set of properties and give stronger correctness guarantees. On the 
other hand, there is enough evidence that neither static analysis nor software model 
checking are currently at the stage where one dominates the other; see also [5], 

More importantly, one should not underestimate the power of simplicity and ease of 
use of a (software) tool. In a relatively short time and with very little effort, Dialyzer 
managed to identify a large number of software defects that had gone unnoticed after 
years of testing. Moreover, it managed to identify bugs that are relatively easy to correct 
— in fact some of them have been already — which brings software in a state closer 
to the desired goal of total correctness. One fine day, some projects might actually win 
their war! 
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Abstract. This paper shows how type effect systems can be combined 
with mo del- checking techniques to produce powerful, automatically ver- 
ifiable program logics for higher-order programs. The properties verified 
are based on the ordered sequence of events that occur during program 
execution — an event history. Our type and effect systems automatically 
infer conservative approximations of the event histories arising at run- 
time, and model-checking techniques are used to verify logical properties 
of these histories. 

Our language model is based on the A-calculus. Technical results 
include a powerful type inference algorithm for a polymorphic type effect 
system, and a method for applying known model-checking techniques 
to the history effects inferred by the type inference algorithm, allowing 
static enforcement of history- and stack-based security mechanisms. 



1 Introduction 

Safe and secure program execution is crucial for modern information systems, 
but is difficult to attain in practice due to both programmer errors and inten- 
tional attacks. Various programming language-based techniques increase pro- 
gram safety, by verifying at compile- and/or run-time that programs possess 
certain safety properties. This paper proposes a foundation for automated verifi- 
cation of program properties, by defining a process for automatically predicting 
event histories of program executions at compile-time, and for specifying and 
statically verifying properties of these histories. Our particular focus is on secu- 
rity applications, but the techniques apply broadly. 

An event is a record of some program action, explicitly inserted into program 
code either manually (by the programmer) or automatically (by the compiler). 
Events are conceived of broadly, and can be a wide array of program actions — 
e.g. opening a file, an access control privilege activation, or entry to or exit from 
a critical region. Histories are ordered sequences of events; whenever an event 
is encountered during program execution, it is appended to the current history 
stream, and thus histories record the sequence of program events in temporal 
order. Program event histories are similar to audit trails, and provide many of 
the same benefits for monitoring system activity. 

W.-N. Chin (Ed.): APLAS 2004, LNCS 3302, pp. 107-128, 2004. 
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Verification consists of checking to make sure that histories are well- formed, 
i.e. the sequence of events prior to a check conforms to specifications. For ex- 
ample, if the program is sending and receiving data over an SSL socket, the 
relevant events are opening and closing of sockets, and reading and writing of 
data packets. An example event history produced by a program run could be: 

ssl_open("snork. cs . jhu. edu" ,4434) ; ssl_hs_begin(4434) ; 
ssl_hs_success (4434) ; ssl_put (4434) ; ssl_get (4434) ; 
ssl_open("moo . cs .uvm. edu" ,4435) ; ssl_hs_begin(4434) ; 
ssl_put (4435) ; ssl_close (4434) ; ssl_close (4435) 

Here, ssl_open is a sample event with two arguments, a url and a port. 
Event histories can then be used to detect logical flaws or security violations. 
For SSL, sockets must first be opened, handshake begun, handshake success, and 
only then can data be get/put over the socket. For example, The above history 
is illegal because data is put on socket 4435 before notification has been received 
that handshake was successful on that socket. 

The previous paragraph informally defines the well-formed event histories 
for SSL connections; codifying this assertion as a local check in a decidable logic 
would provide a rigorous definition of well-formedness, and would allow mechan- 
ical verification of it. Such a mechanical verification increases the reliability and 
security of programs. 

1.1 Overview 

Several systems have been developed along these general lines; perhaps the prin- 
cipal division between them is run-time [20, 1] vs. compile-time [3, 7, 4] verifica- 
tion. The focus of this paper is on the latter. 

The systems all have in common the idea of extracting an abstract interpre- 
tation of some form from a program, and verifying properties of that abstraction. 
The MOPS system [7] compiles C programs to Push-down Automata (PDAs) 
reflecting the program control flow, where transitions are program transitions, 
and the automaton stack abstracts the program call stack. [14, 4] assume that 
some (undefined) algorithm has already converted a program to a control flow 
graph, expressed as a form of PDA. 

These aforementioned abstractions work well for procedural programs, but 
are not powerful enough to fully address advanced language features such as 
higher-order functions or dynamic dispatch. Our approach is to develop a type 
and effect system to extract abstract interpretations from higlrer-orcler programs. 

Most type systems predict the class of values to which a program will eval- 
uate, but type and effect systems [25, 2] predict program side-effects as well. In 
our system, history effects capture the history events generated by programs, 
with the property that the effect of a program should conservatively approxi- 
mate the history generated by that program during execution. History effects 
accomplish this by specifying a set of possible histories containing at least the 
realized execution history. History effects yield history streams via an LTS (La- 
belled Transition System) interpretation: the LTS transitions produce a label 
stream from a history effect. 
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The intepretation of history effects as LTSs allows the expression of program 
assertions as temporal logical formulae, and the automated verification of as- 
sertions via model-checking techniques [23] . Some of the aforecited systems also 
automatically verify assertions at compile-time via model-checking, including 
[3,7,4]. (None of these works however defines a rigorous process for extract- 
ing an LTS from higher-order programs; there are a few more closely related 
systems, discussed in the conclusion.) In these works, the specifications are tem- 
poral logics, regular languages, or finite automata, and the abstract control flow 
is extracted as an LTS in the form of a finite automaton, grammar, or PDA. 
These particular formats are chosen because these combinations of logics and 
abstract interpretations can be automatically model-checked. 

Security automata [20] use finite automata for the specification and run-time 
enforcement of language safety properties. Systems have also been developed 
for statically verifying correctness of security automata using dependent types 
[26], and in a more general form as refinement types [18]. These systems do not 
extract any abstract interpretations, and so are in a somewhat different category 
than the aforementioned (and our) work. These systems also lack type inference, 
and do not use a logic that can be automatically verified. 

Logical assertions can be local, concerning a particular program point, or 
global, defining the whole behavior required. However, access control systems 
[27,1,9], use local checks. Since we are interested in the static enforcement of 
access control mechanisms (see section 6), the focus in this paper is on local, 
compile-time checkable assertions, though in principle the verification of global 
properties is also possible in our system. 

1.2 The Technical Development 

In the remaining text, we formalize our ideas in the system Ahi s t : we define the 
type and effect system, and develop a type inference algorithm. We next show 
how the /z-calculus can be integrated as a static verification logic. A stack-history 
variant is also developed, by redefining the history as those events in active 
frames only. As an extended example of the system, we show how a rich model 
of Java stack inspection with first-class privilege parameters can be expressed 
via embedded /i-calculus formulae in this latter variant. 



2 The Language A his t 

In this Section we develop the syntax, semantics, and logical type theory for 
our language model, called Ahist- History effect approximation and type safety 
results are stated in Theorem 1 and Theorem 2, respectively. 

2.1 Syntax 

The syntax of the theory Ahist is given in Fig. 1. The base values include booleans 
and the unit value (). Expressions let a: = fine are included to implement let- 
polymorphism in the type system (subsection 2.3). Functions, written X z x.e, 




110 



C. Skalka and S. Smith 



C £ C 






atomic constants 


b ::= true | false 




boolean values 


v ::= x 


X z x.e | c | 


b M 


V | A | () values 


e ::= v 


ee | ev(e) 


0 (e) 


ife then e else e let x = vine expressions 


V ■■■= e 


ev(c) | 77; 




histories 


E ■■■■=[} 


v E | Ee 


ev(E) 


| 0 (F) | if E then e else e evaluation contexts 



Fig. 1 . Ahist language syntax 


77, {X z x.e)v 77, e[v/x][X Z x.e/z\ 


(0) 


77, -itrue 77, false 


( notT ) 


77, -ifalse 77, true 


( notF ) 


77, A true 77, Xx.x 


( andT ) 


77, A false ?7,A_.false 


( andF ) 


77, V true 77, A_.true 


(orT) 


77, V false 77, Xx.x 


(orF) 


77, iftruethen ei elsee2 77, ei 


am 


77, if false then ei elsee2 77, e2 


UfF) 


77, let* = v in e 77, e[v/x\ 


(let) 


r/,ev{c) -7 77; eu(c),() 


(event) 


77, 0 (c) -w 77; ey^,(c), () if 77 ( 0 (c), fj eo^(c)) 


(check) 


77, F[e] — > 77', F[e'] if 77, e 77', e' 


(context) 



Fig. 2 . Ahist language semantics 



possess a recursive binding mechanism where z is the self variable. We assume 
the following syntactic sugarings: 

ei A e2 = Aeie2 ei V e2 = Veie2 Xx.e = X z x.e z not free in e 
A_.e = Xx.e x not free in e e\\ e2 = (A_.e2)(ei) 

Events ev are named entities parameterized by constants c (we treat only the 
unary case in this presentation, but the extension to n-ary events is straightfor- 
ward). These constants c £ C are abstract; this set could for example be strings 
or IP addresses. Ordered sequences of these events constitute histories rj , which 
maintain the sequence of events experienced during program execution. We let fj 
denote the string obtained from this sequence by removing delimiters (;). History 
assertions 0, also parameterized by constants c, may be used to implement his- 
tory checks. These assertions are in a to-be-specified logical syntax (section 4 ). 
We presuppose existence of a meaning function II such that 77 ( 0 (c), if) holds 
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iff <j)(c) is valid for ?); we also leave the meaning function 77 abstract until later 
(section 4). 

Parameterizing events and predicates with constants c allows for a more 
expressive event language; for example, in section 6 we show how the parame- 
terized privileges of Java stack inspection can be encoded with the aid of these 
parameters. 

2.2 Semantics 

The operational semantics of Ahist is defined in Fig. 2 via the call-by-value small 
step reduction relations and — » on configurations 77 , e, where p is the history of 

run-time program events. We write — »* to denote the reflexive, transitive closure 
of — >. Note that in the event reduction rule, an event ei '(c) encountered during 
execution is added to the end of the history. The check rule specifies that when 
a configuration r),(j> is encountered during execution, the “check event” evffc) 
is appended to the end of 77 , and <j)(c) is required to be satisfied by the current 
history r], according to our meaning function II. The reasons for treating checks 
as dynamic events is manifold; for one, some checks may ensure that other checks 
occur historically. Also, this scheme will simplify the definition of 77, as well as 
history typing and verification, as discussed in section 4. In case a check fails at 
runtime, execution is “stuck”; formally: 

Definition 1. We say that a configuration 77 , e is stuck iff e is not a value and 
there does not exist ff and e! such that 77 , e — > ?/,e / . If e, e — ■>* 77 , e' and rj,e' is 
stuck , then e is said to go wrong. 

The following example demonstrates the basics of syntax and operational 
semantics. 

Example 1. Let the function / be defined as: 

/ = A z a:.ifa:then eui(c) else (ev 2 (c); z(true)) 

Then, in the operational semantics, we have: 

e, /(false) -A eu 2 (c); e?q(c),() 

since the initial call to / will cause ev 2 to be added to the history, followed by 
a recursive call to / that hits its basis, where event eiq is encountered. 

2.3 Logical Type System 

In the type analysis, we are challenged to statically identify the histories that 
result during execution, for which purpose we introduce history effects 77. In 
essence, history effects 77 conservatively approximate histories 77 that may de- 
velop during execution, by representing a set of histories containing at least 77 . 
A history effect may therefore be an event eu(c), or a sequencing of history ef- 
fects 77 i; 77 2 , a nondeterministic choice of history effects 77 i| 77 2 , or a 77 -bound 




